Release 4.0

Big announcement - edgartools 4.0 was released.

It was actually released on April 26th, 2025, and there have been two minor releases since then. There was less fanfare than there should have been, so I'm catching up with this post.

The primary goal of Release 4.0 was to finalize the rewrite of the library's XBRL functionality and improve the accuracy and reliability of financial data. Much of the actual work was done in a parallel implementation called XBRL2 in the version 3 stream, so the new functionality had been available and under test for quite some time. In 4.0, XBRL2 simply became the core XBRL functionality, with less fanfare than a major cutover would have had. I think it has met most of the goals I had for it, although I have noticed a few things that could still be improved.

The XBRL changes lay the foundation for features derived from financials, such as financial metrics. Basic metrics are actually implemented in Release 4.0, but we need to spend more time getting them polished and rock-solid reliable, probably for a 4.1 release. The challenge here is the limit of my own understanding - I know a lot about financials but I am not an expert.

The value of experts

This is where I will rely on experts - in an interesting way. The XBRL rewrite was mostly vibe coded. What that means is that I relied heavily on LLM coding tools to generate most of the code under my supervision. But most of the intelligence the AI tools brought to the process was distilled from their training on information that human experts produced and that the AI companies collated into the training data. I also used the AI tools to generate Technical Plans and Code Quality Plans for Release 4.0, and there too the artificial intelligence was learnt from how humans do these things effectively. Eventually I want to harness AI experts to do these sorts of tasks, and doing Release 4.0 was a really good training ground for me.

Actual human experts are even more important. With AI you get the smushed average of millions of sources. With humans you get an angle, an opinion, a point of view. With AI you get confident answers. With humans you get questions that provoke an insight. AIs don't desire. Humans do. So when someone creates a GitHub issue asking why statements generated from XBRL differ from those in the filings online, you know there is an actual need behind the question, and finding a solution involves getting some surprising insight.

What's next

So Release 4.0 and future releases will go beyond code to produce documents - packed with information about how XBRL and financials work and should work. I have been using AI to generate documents that describe how parts of the API like Ownership work, and also design specifications like the High Level XBRL Parser Design. These are meant to accelerate AI-assisted development, but I think we can go further by pulling human discussions from GitHub issues and discussions into the design.

Currently I manually read and comment on the discussions and then code improvements and fixes, but maybe there's a better way, like running actions to read and pull information from discussion threads into design specifications. That's what's next - exploring how we use and produce information from SEC Edgar filings and what's involved in working with them.
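As a rough sketch of that idea, the snippet below filters a handful of discussion comments by keyword and renders the matches as a plain-text section that could be pasted into a design specification. Everything in it is hypothetical and for illustration only: the Comment shape, the function names collect_notes and render_spec_section, and the sample comments are all made up, and it deliberately leaves out the step of fetching threads from GitHub.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """A single discussion comment (hypothetical shape, not a GitHub API type)."""
    author: str
    body: str


def collect_notes(comments, keywords):
    """Keep comments whose body mentions any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [c for c in comments if any(k in c.body.lower() for k in lowered)]


def render_spec_section(title, notes):
    """Render the selected comments as a plain-text section for a design spec."""
    lines = [title, ""]
    for note in notes:
        # Collapse whitespace so a multi-line comment becomes a single bullet.
        summary = " ".join(note.body.split())
        lines.append(f"- {summary} (raised by {note.author})")
    return "\n".join(lines)


# Made-up sample data standing in for a real discussion thread.
comments = [
    Comment("alice", "The balance sheet totals differ\nfrom the filing online."),
    Comment("bob", "Thanks for the release!"),
]
notes = collect_notes(comments, ["balance sheet", "income statement"])
section = render_spec_section("Open questions from discussions", notes)
print(section)
```

Keyword matching is the crudest possible filter. In practice this is probably where an AI tool would summarise and group the comments, with a human still deciding what makes it into the specification.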

About edgartools

edgartools is a library for extracting data and insights from SEC Edgar filings.

You can install it with pip install edgartools

Visit the GitHub repo and give it a star if you find it useful.