CSA Developer's Guide

Overall Design

The clickstream data aggregation (tracking) interface consists of a single session-scoped JavaBean (com.ths.csa.TrackedSession) and a demonstration JSP file (SessionTracker.jsp). Once the JavaBean is bound to a session and its initialize() method has been called, the bean tracks events that occur in the user's session through the methods in its interface. These events include page views, timeouts, logouts, errors, and user-defined events. When the session is closed, the contents of the session are written to a unique XML file. The filename is the user's session ID, and the path is specified by an argument to TrackedSession's initialize() method. The tracking functionality may be used by itself, but the reporting functionality depends on it.
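
The close-of-session behavior described above can be pictured with a small stand-alone sketch. This is not the TrackedSession implementation: the class SessionLogSketch, its record() and close() methods, and the 'id', 'type', and 'detail' attribute names are all hypothetical and exist only for illustration. It shows one plausible way to collect events in memory and write them to a file named after the session ID in a caller-supplied directory.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical stand-in for illustration only; not the real TrackedSession. */
public class SessionLogSketch {
    private final String sessionId;
    private final Path outputDir;
    // Each entry holds {type, detail}; both names are assumptions.
    private final List<String[]> events = new ArrayList<>();

    public SessionLogSketch(String sessionId, Path outputDir) {
        this.sessionId = sessionId;
        this.outputDir = outputDir;
    }

    /** Remembers one event until the session is closed. */
    public void record(String type, String detail) {
        events.add(new String[] {type, detail});
    }

    /** Writes all recorded events to <outputDir>/<sessionId>.xml and returns that path. */
    public Path close() throws IOException, XMLStreamException {
        Files.createDirectories(outputDir);
        Path file = outputDir.resolve(sessionId + ".xml");
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("session");
            xml.writeAttribute("id", sessionId);
            for (String[] event : events) {
                xml.writeEmptyElement("event");
                xml.writeAttribute("type", event[0]);
                xml.writeAttribute("detail", event[1]);
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        }
        return file;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("csa-sketch");
        SessionLogSketch log = new SessionLogSketch("ABC123", dir);
        log.record("pageview", "/index.jsp");
        log.record("logout", "");
        System.out.println("Wrote " + log.close());
    }
}
```

Using the session ID as the filename keeps each file unique without any extra bookkeeping, which is presumably why the tracker names its files that way.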

The reporting interface of CSA consists of a single application-scoped JavaBean (com.ths.csa.ClickStreamAnalyzer) called by a single JSP file (ClickStream.jsp). An initial call to ClickStream.jsp generates a form listing all available folders that contain tracked session XML files. The default ClickStream.jsp displays the folders as a date-based hierarchy that follows a YYYY/MM/DD directory structure. When the user selects one or more folders and submits the form, ClickStreamAnalyzer's doAnalysis() method is called. All the session XML files in the selected folders are then parsed and summarized. Because this process is CPU and I/O intensive, it is suggested that the reporting tool be installed on a non-production server with network access to the filesystem containing the session XML files. doAnalysis() returns a CSAResultSet object from which summary values are pulled and displayed.
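
To make the YYYY/MM/DD folder layout concrete, here is a minimal sketch of how the day-level folders under a root directory could be enumerated. It is not the code used by ClickStream.jsp or ClickStreamAnalyzer: the class SessionFolderLister and its listDayFolders() method are hypothetical names, and the sketch assumes four-digit years and two-digit months and days.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not part of CSA.jar. */
public class SessionFolderLister {

    // Matches relative paths such as 2024/01/15 (assumed zero-padded).
    private static final Pattern DAY_FOLDER = Pattern.compile("\\d{4}/\\d{2}/\\d{2}");

    /** Returns the day-level folders under root as sorted relative paths using '/' separators. */
    public static List<String> listDayFolders(Path root) throws IOException {
        // A depth of 3 reaches root/YYYY/MM/DD without descending further.
        try (Stream<Path> paths = Files.walk(root, 3)) {
            return paths
                .filter(Files::isDirectory)
                .map(p -> root.relativize(p).toString().replace('\\', '/'))
                .filter(rel -> DAY_FOLDER.matcher(rel).matches())
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the root folder of the tracked session files as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (String folder : listDayFolders(root)) {
            System.out.println(folder);
        }
    }
}
```

Because the relative paths are zero-padded, a plain lexical sort also yields chronological order.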

All class files for both the tracking and reporting functionality are included in the CSA.jar file.

Installation

  1. Unzip the download file into a temporary directory.

  2. Move the CSA.jar file to the place in your directory structure where you keep third-party JAR files.

  3. Add the CSA.jar file to the classpath of the servlet engine's VM.

  4. Move the JSP files in the /jsp folder to the location of your site's JSP files.

Tracking Integration

Reporting Integration

Very little integration work is necessary to begin using the reporting tool once the tracking system is in place. As mentioned before, the reporting tool requires the XML files generated by the tracking tool, though the tracking tool may be used without the reporting tool. Reading and parsing a large number of XML files taxes both the CPU and the I/O of the hosting server. For best results, host the reporting tool on a non-production server with access to the filesystem containing the XML files. Because the report must be generated before the user's browser times out waiting for the response, a more powerful host server allows a larger number of tracked sessions to be included in a requested report. A feature is currently under development to allow the report to be e-mailed to the requester, thereby allowing very large numbers of tracked sessions to be included.

Integration with analysis engines via XML

CSA writes tracked session data to XML files. These XML files can be easily imported into the databases of high-end inference and rule-based analysis engines. No DTD is currently published for the XML file format, though each file contains one 'session' root element with various attributes, plus some number of 'event' sub-elements, also with various attributes.
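
As a starting point for such an import, the sketch below reads a session document with the standard Java DOM parser and collects one attribute from each 'event' element. Only the 'session' and 'event' element names come from the description above. The class SessionXmlReader, its eventAttributes() method, and the 'id', 'type', and 'page' attributes in the sample are assumptions made for illustration, since no DTD is published.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader for illustration only; not part of CSA.jar. */
public class SessionXmlReader {

    /** Returns the named attribute of every 'event' element, in document order. */
    public static List<String> eventAttributes(String xml, String attributeName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Element root = doc.getDocumentElement();
        if (!"session".equals(root.getTagName())) {
            throw new IllegalArgumentException(
                "expected 'session' root element, found '" + root.getTagName() + "'");
        }

        NodeList events = root.getElementsByTagName("event");
        List<String> values = new ArrayList<>();
        for (int i = 0; i < events.getLength(); i++) {
            Element event = (Element) events.item(i);
            // getAttribute returns an empty string when the attribute is absent.
            values.add(event.getAttribute(attributeName));
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Sample document with assumed attribute names.
        String xml = "<session id=\"ABC123\">"
                   + "<event type=\"pageview\" page=\"/index.jsp\"/>"
                   + "<event type=\"logout\"/>"
                   + "</session>";
        System.out.println(eventAttributes(xml, "type")); // prints [pageview, logout]
    }
}
```

Inspect a real session file to find the actual attribute names before relying on any of the ones used in this sample.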

Modifying and Extending Functionality

CSA was designed so that most required modifications can be made at the JSP level, though extending TrackedSession and ClickStreamAnalyzer is certainly an option. Please get in touch if you need any assistance in that area.

Feature Requests and Bug Reports

Please don't hesitate to submit feature requests, bug reports, or any other comments through the contact page.