MyUW Architecture and System Description

MyUW® is the Web portal of the University of Washington. It provides a personalized view of UW's Web resources for anyone who has a UW NetID. Based on users' affiliations with the UW, it anticipates what they may need by providing a customized set of resources (the initial view), which can be further personalized by users. Users may add their own favorite links on any MyUW page and select the appearance style.

The MyUW system is designed to provide high availability, performance, scalability, and maintainability at low cost. In this documentation, the specifics are covered from the following four perspectives:

  1. The Application Servers
  2. The Database Servers
  3. The Contents Presented Through MyUW
  4. The Supporting Utilities

1. The Application Servers

With these design goals in mind, the configuration of the application servers also follows the principle of simplicity. The seven identically configured Intel boxes, all running the Linux operating system, are divided into three groups:

  1. A cluster of four servers for production (in two separate locations)
  2. A cluster of two for testing
  3. One for development

Note: The MyUW system previously ran on fifteen IBM RS/6000 machines under IBM's AIX operating system. The thirteen production servers were divided into three Web servers, eight application servers, and two content servers. This configuration added unnecessary overhead during troubleshooting, because it was hard to remember the specific role of each host. When the system migrated to Intel boxes in July 2005, all production servers were given an identical role.

Sufficient redundancy is built into the production cluster to prepare for disasters such as an earthquake. Individual server maintenance (i.e., upgrading the operating system, Web server, or application server) is done by taking the host out of the server cluster and putting it back afterward, so the user experience is not interrupted.

Each box runs one Apache Web server and two Jakarta Tomcat application servers, which act as the servlet engine/container. Because MyUW is the mission-critical application with the heaviest usage, Tomcat1 is dedicated to MyUW. Tomcat2 hosts five applications in the MyUW Calendar suite, MyUW eCard, and three applications that supply personal content on MyUW (including the Alumni Information, Calendar, and Housing & Food Services channels). Each Tomcat server may use up to 1.5 GB of memory out of the box's 4 GB of RAM.

The Apache server has the following plug-in modules:

The portal software was developed in-house using Java and a few open source libraries such as Jakarta Commons HttpClient, the Rome Fetcher and RSS/Atom syndication tool, and the Apache Element Construction Set. The Web pages use HTML (with minimal JavaScript) and Cascading Style Sheets. A few small utilities that do not depend on Tomcat (such as the Contact MyUW page) are written in CGI/Perl.

MyUW is designed to work on any Web client that supports cookies and HTTPS. Cookies are used by the Pubcookie authentication process and by MyUW for storing a user session identifier. The Web content in a user session can only be accessed via HTTPS, whereas a guest view of MyUW content can be accessed via HTTP.

A user session is created when a user logs on to MyUW with their UW NetID and password. It is invalidated when the user logs out of MyUW or when it is timed out by the servlet container. Whenever the user accesses a MyUW servlet page, the servlet always validates the session first. If the user has logged out of MyUW, they must re-enter their UW NetID password to get back into MyUW. This prevents others from viewing the page if the user forgets to exit from the browser. If the session is timed out but the Pubcookie is still valid, re-entering the password is skipped.
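
The session rules above can be summarized as a small decision function. The following is a minimal sketch, not the actual MyUW servlet code: the class `SessionGate`, the enum `Outcome`, and the three boolean inputs are hypothetical names introduced for illustration, and the sketch assumes that an explicit logout is visible to the servlet as a flag.

```java
// Hypothetical sketch of the session check described above; not MyUW's actual code.
class SessionGate {

    /** What happens to an incoming request for a MyUW servlet page. */
    enum Outcome { SERVE_PAGE, RESUME_WITHOUT_PASSWORD, REQUIRE_LOGIN }

    /**
     * @param sessionValid   the servlet container still holds a live session
     * @param loggedOut      the user explicitly logged out of MyUW (assumed to be known here)
     * @param pubcookieValid the Pubcookie credential is still valid
     */
    static Outcome check(boolean sessionValid, boolean loggedOut, boolean pubcookieValid) {
        if (loggedOut) {
            // An explicit logout always requires the password again.
            return Outcome.REQUIRE_LOGIN;
        }
        if (sessionValid) {
            return Outcome.SERVE_PAGE;
        }
        // The session timed out: a still-valid Pubcookie lets the user skip the password.
        return pubcookieValid ? Outcome.RESUME_WITHOUT_PASSWORD : Outcome.REQUIRE_LOGIN;
    }
}
```

For example, `SessionGate.check(false, false, true)` models a timed-out session with a valid Pubcookie and yields `RESUME_WITHOUT_PASSWORD`.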

The MyUW servers, the MyUW database, and the network connection are the critical dependencies of the MyUW system.

2. The Database Servers

MyUW Database

The data particular to MyUW is stored in an Informix database. The database servers run on three hosts: a development server and two production servers (a primary server and a secondary server kept in warm standby via log shipping for fail-over purposes).

MyUW connects to the database hosts via a backdoor network and accesses the database using the Informix JDBC driver with connection pooling. Currently only the userid and password used for authentication are encrypted. Because MyUW sends hundreds of thousands of database requests daily during a peak load period (quarter start), database stored procedures are used to minimize the response time of database requests.

Whatami Database

MyUW also accesses a centralized user information database, called Whatami, via a backdoor network. Initially, connections were not encrypted. In early 2008, security was strengthened by using TLS (Transport Layer Security) and SSLSocket connections, where the Whatami Mango server also authenticates MyUW via a client certificate issued by the UWCA.

MyUW retrieves two types of information from Whatami:

  1. The user identity information to set the user's initial view in MyUW
  2. Other database system keys to retrieve personal content from other back-end services

The retrieved Whatami data is then cached in the MyUW database so that MyUW can continue providing service when Whatami is down.
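
The fallback behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the MyUW implementation: the class `CachedLookup` is hypothetical, the `upstream` function stands in for a Whatami query, and an in-memory map stands in for the cache that MyUW keeps in its own database.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: serve the last known value when the upstream lookup fails.
class CachedLookup {
    private final Function<String, String> upstream;                      // stands in for a Whatami query
    private final Map<String, String> cache = new ConcurrentHashMap<>();  // stands in for the MyUW database cache

    CachedLookup(Function<String, String> upstream) {
        this.upstream = upstream;
    }

    /** Returns fresh data when the upstream answers, otherwise the cached copy if one exists. */
    Optional<String> lookup(String netId) {
        try {
            String fresh = upstream.apply(netId);
            if (fresh != null) {
                cache.put(netId, fresh);   // remember the latest answer for later outages
                return Optional.of(fresh);
            }
        } catch (RuntimeException e) {
            // Upstream is unavailable; fall through to the cached copy.
        }
        return Optional.ofNullable(cache.get(netId));
    }
}
```

A user who has never been looked up while the upstream was available has no cached entry, so the sketch returns an empty `Optional` for that case during an outage.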

3. The Contents Presented Through MyUW

Personal Contents

MyUW gets a variety of personal contents from other UW departments. (See the specification on how to write a Web service to publish personal content on MyUW for more information.) These include:

Fetching personal contents can add a significant delay to the MyUW page's overall response time, depending on the response time of the source service. To improve the efficiency of MyUW's process, in late 2007 we implemented multi-stage thread pooling with underlying HTTP connection pooling using Jakarta Commons HttpClient. The sizes of the thread pools are pre-configured based on experimental data. The host-based HTTP connection pools are pre-configured individually based on the traffic to the host.
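
As an illustration of the staged thread pooling idea, the sketch below runs a fetch stage and a render stage on two separate fixed-size pools. It is not the 2007 implementation: it uses the `CompletableFuture` API from later Java versions, omits the HTTP connection pooling that MyUW gets from Jakarta Commons HttpClient, and the class `StagedFetcher`, the two-stage split, and the pool sizes are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch: each stage has its own pre-sized thread pool.
class StagedFetcher implements AutoCloseable {
    private final ExecutorService fetchPool;   // stage 1: slow calls to source services
    private final ExecutorService renderPool;  // stage 2: turning responses into HTML fragments

    StagedFetcher(int fetchThreads, int renderThreads) {
        this.fetchPool = Executors.newFixedThreadPool(fetchThreads);
        this.renderPool = Executors.newFixedThreadPool(renderThreads);
    }

    /** Fetches all sources in parallel; a failed source yields an empty fragment. */
    List<String> fetchAll(List<String> sources,
                          Function<String, String> fetch,
                          Function<String, String> render) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String source : sources) {
            pending.add(CompletableFuture
                    .supplyAsync(() -> fetch.apply(source), fetchPool)
                    .thenApplyAsync(render, renderPool)
                    .exceptionally(error -> ""));
        }
        List<String> fragments = new ArrayList<>();
        for (CompletableFuture<String> f : pending) {
            fragments.add(f.join());   // results come back in the order of the sources
        }
        return fragments;
    }

    @Override
    public void close() {
        fetchPool.shutdown();
        renderPool.shutdown();
    }
}
```

Keeping the stages on separate pools means a slow source service can exhaust only the fetch threads, while rendering of responses that have already arrived continues on the other pool.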

Atom and RSS Feed Contents

MyUW currently has more than sixty RSS content feeds. In mid-2008, we extended the support to cover both Atom and RSS feeds (more information can be found in the specification on MyUW's Atom and RSS syndication feed format) using the Rome Fetcher and RSS/Atom syndication tool. The retrieved content is cached locally at two levels:

  1. A disk cache of the raw content, which is validated based on the ETag in the HTTP header of the feed response. If ETag is not supported, the disk cache is refreshed at each service interval.
  2. A memory cache of the final HTML content fragment, which is validated based on the top-level pubDate/updated time in the feed. If the pubDate/updated element is not present, the memory cache is refreshed at each service interval.
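
The two validation rules can be sketched as below. This is a simplified illustration, not MyUW's code or the Rome Fetcher API: the class `FeedCache` and its nested `Response` type are hypothetical, both cache levels are plain fields rather than a disk file and a shared memory cache, and the ETag check is modeled as a conditional request that answers with status 304 when the stored ETag still matches.

```java
import java.util.function.Function;

// Hypothetical sketch of the two-level validation described above.
class FeedCache {
    /** Minimal stand-in for an HTTP response; status 304 means "not modified". */
    static final class Response {
        final int status;
        final String etag;   // null when the server does not support ETag
        final String body;   // null on a 304 response
        Response(int status, String etag, String body) {
            this.status = status;
            this.etag = etag;
            this.body = body;
        }
    }

    private String rawBody;    // level 1: raw feed content (a disk cache in the description above)
    private String rawEtag;    // ETag stored with the raw copy, or null
    private String html;       // level 2: rendered HTML fragment
    private String htmlStamp;  // pubDate/updated value the fragment was built from, or null
    private int rawUpdates;    // counters, only here to make the behavior observable
    private int renders;

    /** Called once per service interval; conditionalGet receives the stored ETag (possibly null). */
    String refresh(Function<String, Response> conditionalGet,
                   Function<String, String> feedStamp,
                   Function<String, String> render) {
        Response r = conditionalGet.apply(rawEtag);
        if (r.status != 304) {
            // Content changed, or the server has no ETag support: replace the raw copy.
            rawBody = r.body;
            rawEtag = r.etag;
            rawUpdates++;
        }
        String stamp = feedStamp.apply(rawBody);
        if (html == null || stamp == null || !stamp.equals(htmlStamp)) {
            // No timestamp in the feed, or a new one: rebuild the HTML fragment.
            html = render.apply(rawBody);
            htmlStamp = stamp;
            renders++;
        }
        return html;
    }

    int rawUpdates() { return rawUpdates; }
    int renders() { return renders; }
}
```

With a server that supports ETag and a feed that carries a timestamp, repeated calls to `refresh` replace the raw copy and render the fragment only once until the feed changes; without either, the corresponding level is rebuilt on every call.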

Other Dynamic Contents

About a dozen HTML files or images (such as the Student Guide Headlines, the Time Schedules, the Weather Information, and Web-cam images of the Seattle, Bothell, and Tacoma campuses) are gathered from other UW Web pages/sources by a standalone utility program (described in section 4). This content also has two-level caching:

  1. The corresponding disk cache is maintained by the utility program.
  2. The memory cache is maintained the same way as it is for the Atom/RSS content.

User Editable Contents

A few content sections (with an Edit button on the section heading) contain links to the most commonly used UW Web resources, which users can select to turn on or off.

User's Own Links

Users may add their favorite links on any MyUW page. These links are stored in the MyUW database.

4. The Supporting Utilities

Here is an incomplete outline of the standalone supporting utility programs/scripts:


Last updated: 7/9/2008
© 2009 University of Washington
MyUW® is a registered trademark of the University of Washington.