Remember the game Labyrinth? It consisted of a lap-sized wooden maze set in a box. Using knobs on the box, you could tilt the maze on two axes and thereby maneuver a steel marble through the maze. The trick was to steer the ball around a series of holes along the route.
Modernizing the IRS, or any agency’s IT infrastructure, is a little like Labyrinth. One hole IRS hasn’t gotten around is the Individual Master File (IMF). Coded in assembly language 59 years ago, the IMF seems small, at 200,000 lines of code. But the nature of assembler is that each line does a lot. Somehow replicating the logic in a more contemporary language has resisted efforts dating back a couple of decades. That in turn has delayed many service improvements and operational efficiencies the IRS has wanted to make.
When you read stories stating the IRS has “systems from the Kennedy administration,” they’re referring to the IMF. In reality, the IRS hardware infrastructure is quite modern. They have new computers running old code, probably with some translation layer since assembler was designed to access machine instruction sets directly. Still, the IMF code could run indefinitely, but for two reasons. One, the population of programmers able to maintain it is aging and thinning out. Who knows, the biggest remaining group of assembler people might be right in the IRS. Two, the code blocks IRS efforts to deploy a variety of digital services and to improve customer experience. You need modern code for modern service.
According to its 2019, and still current, modernization blueprint, retirement of the IMF is a crucial project. IRS wants to replace it with what it calls State 2 of the Customer Account Data Engine, or CADE-2. It will, the plan states, “allow direct visibility and access to taxpayer account detail on a near real-time basis and furthers the overarching effort to retire the IMF.” More precisely, a major component of the CADE-2, the Individual Tax Processing Engine, or ITPE, will result from the conversion of those 200K lines of assembler language code (ALC).
In columns in 2018 and in January of this year, I detailed a methodology developed by a former IRS manager that could convert the assembler to Java. The IRS went on to obtain a patent on Jian Wang’s process, and Wang himself received recognition from then-IRS Commissioner John Koskinen. It was supported by then-IRS Chief Information Officer Terry Milholland.
Yet the IRS never deployed Wang’s methodology. Having followed tax system modernization since the early 1990s, I wanted to know why. After two years of asking, I obtained an interview. The IRS provided Darrell White, an IRS senior executive who is in charge of CADE-2.
White had praise for what I’ll call the Wang Methodology. “The conversion technology was really genius, I mean, incredible innovation. I want to acknowledge that,” he said.
But White said the resulting Java code would not, in the modernization officials’ estimation, do what IRS needed it to do.
“It would run. It [would be] functionally equivalent, it would do the same thing,” White said. However, the code would not have the characteristics of natively developed Java. It would look opaque to Java programmers, because it would not have the underlying architectural principles “that just didn’t exist when ALC was created.”
White added, “The Java that was created by the automatic tool [Wang’s process] looked a lot like the ALC. However, it’s not the kind of thing where we could hire a journeyman Java programmer off the street, and give it to him or her and have that person be able to really understand what it was doing.” IRS, he said, in effect would have replaced code it can still deal with, with code no one could.
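To make the general concern concrete, here is an invented illustration. It is not output from Wang’s tool and not IRS code; the rule, the class name and both method names are hypothetical. The first method mimics what a line-by-line conversion of assembler-style logic might look like, with simulated registers and a simulated program counter. The second expresses the same made-up rule the way a Java programmer would probably write it from scratch.

```java
// Hypothetical illustration only: neither method is IRS code or tool output.
final class ConversionStyleSketch {

    // Style 1: a line-by-line transliteration of assembler-like logic.
    // Registers become array slots and jumps become a state loop.
    static int transliterated(int amount, int threshold) {
        int[] r = new int[4]; // simulated registers
        r[1] = amount;
        r[2] = threshold;
        int pc = 0;           // simulated program counter
        while (true) {
            switch (pc) {
                case 0: // subtract, then branch on the sign of the result
                    r[3] = r[1] - r[2];
                    pc = (r[3] > 0) ? 1 : 2;
                    break;
                case 1: // positive: keep the difference
                    r[0] = r[3];
                    pc = 3;
                    break;
                case 2: // zero or negative: result is zero
                    r[0] = 0;
                    pc = 3;
                    break;
                default: // exit point
                    return r[0];
            }
        }
    }

    // Style 2: the same invented rule written natively.
    static int amountOverThreshold(int amount, int threshold) {
        return Math.max(0, amount - threshold);
    }

    public static void main(String[] args) {
        System.out.println(transliterated(150, 100));
        System.out.println(amountOverThreshold(150, 100));
    }
}
```

Both methods return the same answers, which is the sense in which converted code can be functionally equivalent. Only the second one tells a newcomer what the rule actually is.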
Instead, White told me IRS is using a “logic harvesting” approach that will give it Java that Java programmers can easily understand and update. And Java that meets overarching program goals such as giving customer service agencies a single view into everything connected with a given taxpayer.
The process White described relies heavily on contractors, including returning retirees. Code block by code block, teams identify what it’s doing and document the rules. Then it gets passed to an architecture and design team that includes “very senior Java developers. They’ll look at the requirements. They understand our overall approach for how we’re building the Java product. And then they will do the design for those requirements.” Then the packages of requirements move on to implementation teams to perform the Java coding itself.
It’s a hybrid of agile and waterfall. The agency isn’t taking the tax law and coding it from scratch. It is extracting logic and recoding in increments. White said the project also comes with a technical framework necessary to implement certain requirements. “It’s not simply conversion of the ALC. That’s where the business rules and requirements are. We also have to build those foundational components for the Java product. We call those technical enablers. It’s another component of the workload that must be completed.” For example, utilities that connect the execution of business rules to the database.
White said the team is close to 60% through code conversion, but has a significant period of testing ahead. “And,” he said, “how long that testing will take is a bit of an unknown.” He said they’re working towards completing the conversion by April — of 2023. Testing is happening now. The testing plan includes parallel validation, that is, running the old and new systems at the same time and comparing the results. The program, White added, will move from mostly coding and a little testing, to a little coding and lots of testing.
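Parallel validation, as White describes it, can be sketched in a few lines of Java. This is a minimal sketch under stated assumptions, not the IRS test harness: the class name, the method names and the fee rule (5% capped at 25,000 cents) are all invented for illustration. The idea is simply to feed identical inputs to the old logic and the new logic and to collect every input on which they disagree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch of parallel validation; nothing here is IRS code.
final class ParallelValidationSketch {

    // Stand-in for the legacy routine: an invented rule, 5% capped at 25,000 cents.
    static long legacyFeeCents(long amountCents) {
        long fee = amountCents * 5 / 100;
        if (fee > 25_000L) {
            fee = 25_000L;
        }
        return fee;
    }

    // Stand-in for the rewritten routine implementing the same invented rule.
    static long rewrittenFeeCents(long amountCents) {
        return Math.min(amountCents / 20, 25_000L);
    }

    // Runs both implementations on every input and returns the inputs
    // for which the two results differ.
    static <I, O> List<I> findMismatches(List<I> inputs,
                                         Function<I, O> legacy,
                                         Function<I, O> replacement) {
        List<I> mismatches = new ArrayList<>();
        for (I input : inputs) {
            O expected = legacy.apply(input);
            O actual = replacement.apply(input);
            if (!Objects.equals(expected, actual)) {
                mismatches.add(input);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        List<Long> inputs = Arrays.asList(0L, 10_000L, 1_000_000L);
        Function<Long, Long> legacy = ParallelValidationSketch::legacyFeeCents;
        Function<Long, Long> rewritten = ParallelValidationSketch::rewrittenFeeCents;
        // An empty list means the two implementations agreed on every input.
        System.out.println(findMismatches(inputs, legacy, rewritten));
    }
}
```

An empty result only shows agreement on the inputs that were tried, which may be one reason the length of a testing phase like this is hard to predict.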
Wang had said his methodology would have finished the job in two years using 15 people. The official position of IRS is that it needed a more robust approach. The current plan runs six years, until 2024. It’s a more traditional government approach, with lots of contractors, lots of dollars, long horizons and major costs. The IRS’s credibility is on the line. It has foundered on the rock of the IMF before. The Treasury Inspector General for Tax Administration is looking at the ALC replacement. A senator asked Commissioner Chuck Rettig about it, citing my columns.
So everyone agrees on the need to retire the IMF.
Ironically, the way the agency has ended up approaching the task, while not using the Wang Methodology, is nevertheless home-grown. I asked White if there was a commercial product for logic harvesting. He said, “We looked at a number of them. At the end of the analysis, we designed and built our approach for how we’re going to do logic harvesting that’s really tuned to the particular challenge in IMF.”
Indeed it has been a challenge.