Enterprise Registration & Data Warehouse
1. Context
In 1999, Shippensburg University's Academic Affairs division operated largely on data it could not interrogate. Student records lived on a mainframe. Enrollment management — the process of understanding who applies, who enrolls, who persists, and who doesn't — was driven by intuition and reports that arrived weeks after the decisions that would have benefited from them.
The timing mattered. Pennsylvania's state university system was under demographic and budgetary pressure. Enrollment was strategic. Institutions that understood their enrollment dynamics could act on them. Institutions that did not were managing by lag.
2. The Problem
The university had data. The data was inaccessible.
The mainframe stored student records with high fidelity but provided no path for analytical access. Standard queries required custom mainframe programs and days of elapsed time. Enrollment management staff were making admissions and retention decisions without the information that would have changed those decisions.
The secondary problem was the portal gap. Students, faculty, and staff had no self-service access to their own records. Academic advising, degree planning, registration status — all of this required in-person visits to offices that maintained the information manually.
3. Why Existing Thinking Failed
The existing model treated the mainframe as the terminal system. Reports flowed out of it; nothing flowed back in analytically. The data was treated as a ledger, not as evidence.
Commercial ERP solutions were beginning to emerge (PeopleSoft, Datatel), but implementing one required capital investment and multi-year transition timelines that were unavailable. The strategic need was immediate.
The portal problem was unsolved because no one had connected the technical capability — a LAMP stack, Oracle or MySQL, basic web authentication — to the operational problem. The tools existed; the architectural vision did not.
4. My Approach
Build the analytical layer independently of the mainframe. Rather than replacing or extending the mainframe system, create a parallel data infrastructure that extracted and transformed mainframe data on a scheduled basis, loaded it into a relational database, and made it available for reporting, analysis, and modeling.
The enrollment warehouse was designed around the questions Academic Affairs actually needed to answer: Which applicant populations convert at what rates? Which student cohorts show early retention risk indicators? Where does the funnel lose students between inquiry and enrollment?
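The funnel question above comes down to simple arithmetic once each prospect's furthest stage is known. The following is a minimal sketch, not the original warehouse logic: the stage names, the `funnel_counts` and `conversion_rates` helpers, and the input shape are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical funnel stages, ordered from first contact to enrollment.
STAGES = ["inquiry", "applied", "admitted", "enrolled"]

def funnel_counts(furthest_stages):
    """Count prospects who reached each stage, given each prospect's furthest stage."""
    furthest = Counter(furthest_stages)
    counts = {}
    running = 0
    # Walk backwards: anyone who reached a later stage also passed the earlier ones.
    for stage in reversed(STAGES):
        running += furthest[stage]
        counts[stage] = running
    return {stage: counts[stage] for stage in STAGES}

def conversion_rates(counts):
    """Stage-to-stage conversion, as a fraction of the previous stage's count."""
    rates = {}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        rates[f"{prev}->{nxt}"] = counts[nxt] / counts[prev] if counts[prev] else 0.0
    return rates
```

Running the same two functions per applicant population (for example by region or intended major) would show which populations convert at what rates and which transition loses the most students.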
The web portals were designed around the same principle: connect existing institutional data to the people who needed it, through access paths that required no training and no in-person intermediary.
5. Technical Solution
The enrollment management data warehouse aggregated and archived enrollment statistics extracted from the mainframe:
- Scheduled extraction, transmission, and conversion of mainframe records to Oracle
- Transformation logic handling the data model differences between mainframe formats and the analytical schema
- Reporting layer producing strategic enrollment management reports for Academic Affairs
- Machine learning models applied to enrollment data to support targeted admissions and retention initiatives, flagging at-risk student populations before those students withdrew
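The extract, transform, and load steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original implementation: the fixed-width `LAYOUT`, the field names, and the `enrollment` table are hypothetical, and Python's built-in `sqlite3` stands in for the Oracle target so the example is self-contained.

```python
import sqlite3

# Hypothetical fixed-width layout: (field name, start offset, end offset).
LAYOUT = [("student_id", 0, 9), ("term", 9, 15), ("status", 15, 16), ("credits", 16, 19)]

def parse_record(line):
    """Slice one fixed-width line into a dict and convert the numeric field."""
    row = {name: line[start:end].strip() for name, start, end in LAYOUT}
    row["credits"] = int(row["credits"] or 0)
    return row

def load(conn, lines):
    """Create the target table if needed, load parsed records, return rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS enrollment ("
        "student_id TEXT, term TEXT, status TEXT, credits INTEGER, "
        "PRIMARY KEY (student_id, term))"
    )
    rows = [parse_record(line) for line in lines if line.strip()]
    # INSERT OR REPLACE keeps a rerun of the same extract from duplicating rows.
    conn.executemany(
        "INSERT OR REPLACE INTO enrollment VALUES (:student_id, :term, :status, :credits)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Keying the table on student and term is one possible design choice: it makes a scheduled load safe to repeat, which matters when a transfer from the source system has to be rerun.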
The integrated academic information system (info.ship.edu) provided secure self-service web portals for students, faculty, and staff. The system ran on Linux and Solaris, with Oracle and MySQL backends, and served the university for nearly a decade — built at a time when self-service web portals for higher education were not standard.
The back-end infrastructure included mainframe data transfer bridges, network backups, and database maintenance — all operated with a small team responsible for the full stack from hardware through user interface.
6. Organizational Challenges
The project required simultaneous participation in university-wide strategic planning teams: enrollment management, academic deans and directors, and the university website team. The technical work was not separable from the organizational work: the value of the warehouse depended entirely on whether Academic Affairs would use it, and using it required trust that the data was accurate.
Building that trust meant demonstrating fidelity first: every report needed to match what administrators already believed to be true before introducing insights they had not previously had access to. A warehouse that challenged existing beliefs before establishing credibility would be disregarded.
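One way to make "fidelity first" concrete is an automated reconciliation: compare warehouse totals against figures administrators already accept, and report every disagreement before any new analysis is published. The sketch below assumes headcounts keyed by hypothetical (term, college) pairs; the `reconcile` function and its input shape are illustrative, not the original process.

```python
def reconcile(warehouse_totals, reference_totals):
    """Return {key: (warehouse_value, reference_value)} for every disagreement.

    A key present on only one side is reported with None for the missing side.
    """
    mismatches = {}
    for key in sorted(set(warehouse_totals) | set(reference_totals)):
        w = warehouse_totals.get(key)
        r = reference_totals.get(key)
        if w != r:
            mismatches[key] = (w, r)
    return mismatches
```

An empty result means the warehouse reproduces the known figures exactly. Anything else is a list of specific rows to investigate before the report reaches Academic Affairs.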
Supporting Academic Affairs strategically — not just building systems for them — meant understanding enrollment management as a domain, not merely as a data problem. The machine learning models for targeted admissions were only valuable because the people using them understood what the predictions meant.
7. Outcome
Academic Affairs gained ongoing analytical access to enrollment data that had previously been locked in the mainframe. The enrollment warehouse enabled evidence-based planning for admissions and retention — years before most institutions of comparable size had equivalent capability.
The self-service portals reduced in-person office traffic and gave students, faculty, and staff access to their own records. info.ship.edu continued serving the university for nearly ten years.
The machine learning applications to enrollment management — predicting conversion rates, identifying retention risks — represented early applied ML in a higher education context, at a time when such applications were uncommon at institutions this size.
8. Lessons That Generalize
Information becomes strategic when it is connected. Data locked in a mainframe is not information — it is potential. The same data, accessible at the speed of a question rather than the speed of a report request, changes how decisions get made.
Build the analytical layer before replacing the source system. The instinct is often to fix the source first. But the source system is usually working fine for the people who built it. Adding an analytical layer alongside the source — extracting, transforming, and making data queryable — delivers strategic value without disrupting operational continuity.
Trust is a prerequisite for insight. A data warehouse that produces results no one trusts will not change decisions. The path to analytical influence runs through demonstrated accuracy on things already known, before introducing things not previously known.
Early ML applications are worth more for what they teach than for their precision. Applying machine learning models to enrollment data in 1999 produced models that were imprecise by any current standard. They were also ahead of anything peer institutions were doing, and the experience of building and interpreting them compounded over decades. Early exposure to a powerful technique creates long-term capability even when the early results are imperfect.
Related: Principle 1 — Build Systems, Not Components · Principle 2 — Reduce Friction Before Increasing Effort · Mental Models — Friction Mapping · Engineering — The Infrastructure-Before-Mandate Pattern