A Conversation with the Creators of Indexia


Welcome, friend. All those years ago, when I was writing research papers, one of the first places I would turn in a book was the index. Before committing to hundreds of pages, I wanted to know whether the book discussed the concepts I was researching. An index could tell me a surprising amount about a book before I had read a single chapter. Nowadays, it is cookbook indexes that I rely on most, to find recipes.

Today I am chatting with Will Dinneen and Ben Vagle, the creators of Indexia, an AI-powered indexing tool designed to help authors create indexes faster, more affordably, and with greater editorial control. Our conversation explores the hidden art of indexing, the challenges of teaching AI to map knowledge, and why a great index can reveal as much about a book as the book itself.

Let’s welcome them to Armed with A Book.


Hi Will and Ben! Welcome to Armed with A Book. Tell me and my readers a bit about yourselves.

Thanks for having us, Kriti!

We’re both JD candidates at Stanford Law School. Before law school, Ben worked at the U.S. Treasury Department on the Committee on Foreign Investment in the United States (CFIUS) and co-authored a book on U.S.-China economic competition, Command of Commerce, published by Oxford University Press. Will studied international studies and history at Emory, then worked as a data scientist at the University of Pennsylvania’s Development Research Lab, where he applied computational methods to political science research.

We came to Indexia from very different angles — Ben from the author’s side, having gone through the pain of indexing his own book, and Will from the data side, seeing it as a fascinating problem in knowledge extraction. But we shared a deep interest in how technology can make academic publishing more accessible.

Indexia launched in November 2025. What is it about?

Indexia is an AI-powered tool that creates back-of-book indexes. You upload your manuscript, and our system reads through it, extracts the key concepts, names, places, and themes, maps the relationships between them, and produces a structured, navigable index — typically within about thirty minutes.

The traditional way to get an index is to hire a professional indexer, which costs anywhere from $1,000 to $3,000 and takes several weeks. That’s a real barrier, especially for first-time authors, independent scholars, and smaller publishers on tight budgets and tighter timelines. Indexia brings that cost down to $99 for a standard index, and gives authors an interactive editor so they can review, refine, and reshape the result before exporting it in formats like Word, LaTeX, or even CINDEX-compatible XML.

We think of Indexia as a collaborative tool. The AI does the heavy lifting of the initial extraction and organization, and the human brings their judgment in for the final draft.

What problem in indexing first made you think, “There has to be a better way to do this”? How would you describe the moment of its spark — one filled with frustration, curiosity or something else?

Honestly, it was personal frustration. Ben had just finished writing his own academic book and reached the stage of the publishing process where he needed an index. He quickly discovered that professional indexing costs thousands of dollars and takes weeks, and that the only alternative was to do it himself, which meant reading his entire book again with a spreadsheet open, manually logging every concept and page reference, and deciding how the final index should be structured.

Ben was just a graduate student looking at a deadline and thinking, “I just spent two years writing this book. There has to be a faster way to do this.” That frustration turned into a prototype, then, with Will’s help, the prototype turned into something that actually worked as a product, and once other authors started using it and telling us how much time and money it saved them, we realized we’d stumbled onto a real gap in the market.

Before building Indexia, what was your personal relationship with indexes — as readers, writers, and/or scientists?

As readers, we were probably like most students. We used indexes constantly, but we never thought much about them. You flip to the back of a textbook to find a concept or you check whether a topic you’re researching is covered in a book, then you continue on.

As writers and researchers, though, you start to see how much craft goes into a good index. When Ben was writing Command of Commerce, he realized that indexes do more than just list terms: they also organize concepts and provide a curated, concise map of what the book is about. Professional indexers treat high-quality indexes as an art form: how cross-references are handled, what level of granularity to use, whether to index arguments or just topics. You start to realize that an index is also an interpretation of the book.

Will had also used indexes quite a bit while writing his history thesis, but with a background in data science he also saw indexes through a different lens: as a knowledge extraction problem. How do you take 300 pages of dense academic prose and distill it into a useful structured map of ideas? That’s fundamentally a classification and relationship-mapping challenge, which is exactly the kind of thing natural language computation is designed for. At UPenn, he’d used AI for his own research (e.g., mapping policy discussions across thousands of legislative documents) and he realized that similar methods might work with indexing.

What makes a *great* index in your view?

A great index helps you discover things you didn’t know to look for. Armed with an index, a reader should have a basic idea of what topics are discussed and how those topics relate to each other. The best indexes have a kind of intellectual generosity to them. In addition to listing topics, they surface surprising connections and map out the themes that run quietly through the whole book.

Practically speaking, a great index has a few hallmarks: consistent terminology that matches how the author actually uses concepts, a clear hierarchy so you can drill from broad topics to specific subtopics, thoughtful cross-references that connect related ideas, and accurate page references. The Chicago Manual of Style has excellent guidelines on this, and we’ve built those standards directly into Indexia’s pipeline.

But the real test of a great index is whether a reader can pick up a book they’ve never read and, just by scanning the index, get a sense of what the book is actually about. If the index reads like an intellectual outline of the work, it’s doing its job.

Indexes quietly guide readers through a book’s ideas. Is there a book whose index — or structure — changed the way you approached knowledge or research?

For Ben, it was actually the experience of working on Command of Commerce. The book argues that conventional measures of economic power — like GDP — dramatically understate America’s advantage over China, and that you need to look at things like corporate profits, technological dominance, and financial leverage to see the full picture. Building the index for that book forced him to confront the structure of his own argument in a new way. You realize which concepts are truly central and which are peripheral. It’s like reverse-engineering your own thinking.

Will enjoys perusing the indexes of history books the most. A good index can give you a huge amount of information about a book, and it is often stimulating to read in its own right. For example, take this excerpt from the index of the book Once Within Borders by Charles S. Maier:

“Railroads and telegraphs: space and time, effect on, 189; commercial interconnectivity, 189–190; effect on collective loyalties, 190, 344n8, 344n9; transformative impact, 193–194, 288; state subsidies for land and construction of, 195–197; national territory and, 197; slavery and, 198; . . . ‘railway imperialism,’ 208, 350n63; . . . military strategy and, 213; . . . global domination, fear of, 214; . . .”

Even with just this tiny excerpt a reader knows that railroads and telegraphs will be discussed extensively throughout the book, and a reader can even make an educated guess as to what the argument might be: railroads and telegraphs, through their transformative effects on commercial interconnectivity, changed people’s collective loyalties, ultimately becoming an important tool for nation building, territorial expansion, and military strategy. This density makes indexes endlessly fascinating.
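For readers curious how an entry like this looks as data, here is a minimal sketch. It assumes an entry is a main heading plus a list of subentries, each with its own locators, rendered in a run-in style. The function name `render_entry` is hypothetical and is not taken from Indexia or any indexing software.

```python
def render_entry(heading, subentries):
    """Render one run-in index entry.

    `subentries` is a list of (subentry text, list of locator strings).
    Locators stay as strings so note references such as "344n8" survive.
    """
    parts = [f"{text}, {', '.join(locators)}" for text, locators in subentries]
    return f"{heading}: " + "; ".join(parts)


# The first three subentries from the excerpt above.
entry = render_entry(
    "Railroads and telegraphs",
    [
        ("space and time, effect on", ["189"]),
        ("commercial interconnectivity", ["189–190"]),
        ("effect on collective loyalties", ["190", "344n8", "344n9"]),
    ],
)
print(entry)
```

Keeping the heading, subentries, and locators separate is what lets an editor merge, reorder, or re-nest entries without retyping page numbers.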

Is indexing more of a technical task or an interpretive one?

Both, and the balance is what makes it interesting. On its face, indexing is a technical discipline. There are consistent conventions to follow: handling names and titles correctly, formatting cross-references, making sure page ranges are accurate. The Chicago Manual of Style has specific rules about all of this, and getting them right matters for usability.

But the interpretive dimension is what separates a mechanical keyword list from a real index. An indexer has to decide: Is this concept important enough to include? Is it a main entry or a subentry under something broader? When the author uses three different terms for the same idea across different chapters, which one becomes the canonical entry? Those are intellectual judgments, not just technical ones, and they require an intimate understanding of how an index works and of the particular content of a book. It’s extremely intellectually demanding.

This is actually core to how we’ve designed Indexia. The technical parts — extraction, page reference verification, formatting compliance — are where AI excels. It can process an entire book systematically and consistently in ways that are genuinely hard for a person working with a spreadsheet. But we’ve built the system so that the interpretive judgments remain accessible to the author. After Indexia’s AI has produced an initial index, the user can merge terms, restructure hierarchies, add or remove entries, and shape the index into something that reflects their understanding of the work. Ultimately, whether an index is “good” or “bad” is just as subjective as whether a book is “good” or “bad”. An index (or a book) can be accurate, it can be inaccurate; it can be comprehensive, it can be concise; it can be creative, it can be straightforward; and, for any given reader with any given goal, it can be useful or not.

What are some common mistakes people make when creating indexes?

The biggest one is probably over-indexing or under-indexing concepts. Including every mention of a term rather than focusing on substantive discussions, or including every possible term, is a sure way to go over your publisher’s page limit. A good index entry should point readers to where a concept is meaningfully discussed, not just mentioned in passing. On the flip side, under-indexing is just as common. Authors creating their own indexes tend to index books the way they think makes sense. But, having written the book, they struggle to imagine the perspective of someone who hasn’t read it, and so they can miss threads that new readers might need.

Where does AI shine in indexing, and where do you still see human judgment as essential?

AI shines in the exhaustive, systematic parts of the work. It can read every page of a 400-page book and identify potential index terms without fatigue, without losing focus in chapter twelve, without forgetting that a concept mentioned in the introduction comes back in the conclusion. It’s also very good at detecting relationships between terms.

Our system uses thousands of focused AI calls rather than one big prompt, which means it can build deep context about how each concept is used throughout the book. It generates summaries at the page, section, and book level, then uses those to inform its extraction and relationship-mapping. That layered approach catches things that even a careful human reader might miss on a single pass.
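The layered idea described here can be illustrated with a small sketch. It is not Indexia's code: the function names are hypothetical, and the default summarizer simply keeps the first few words where a real system would presumably make a focused model call.

```python
from typing import Callable


def first_words(text: str, limit: int = 8) -> str:
    """Placeholder summarizer: keep the first `limit` words."""
    return " ".join(text.split()[:limit])


def layered_summaries(sections, summarize: Callable[[str], str] = first_words):
    """Build summaries bottom-up: pages, then sections, then the whole book.

    `sections` maps a section name to the list of its page texts.
    """
    # One small summary per page.
    page_level = {
        name: [summarize(page) for page in pages]
        for name, pages in sections.items()
    }
    # Each section summary is built from its page summaries, not raw text.
    section_level = {
        name: summarize(" ".join(summaries))
        for name, summaries in page_level.items()
    }
    # The book summary is built from the section summaries.
    book_level = summarize(" ".join(section_level.values()))
    return page_level, section_level, book_level
```

Each level only sees the compact output of the level below it, which is one way many small calls can stand in for a single very large prompt.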

Where human judgment remains essential is in the editorial layer. Is this concept important enough to deserve its own entry, or is it background noise? When the AI finds fifteen mentions of “democracy,” does the author want that as a major entry with subentries, or is it ambient context for a book that’s really about something else? What’s the right level of granularity for this particular book’s audience?

That’s why we’ve built Indexia as a collaborative tool rather than a fully automated one. The AI generates a comprehensive first draft, and the author or editor shapes it into something that reflects their intent.

Indexes are essentially maps of ideas. What does building an AI index teach you about how knowledge is structured? Did it change the way you think about how humans organize knowledge?

It really did. One of the most fascinating things about building Indexia was discovering how differently the same book can be indexed depending on what you prioritize. Knowledge is a network, and an index is one possible path through that network. The same book could have an index organized around people, or concepts, or chronology, or arguments, and each would be “correct” but reveal different things about the text.

What surprised us is how much structure is implicit in good writing. When you run AI extraction across a well-written academic book, the hierarchy almost emerges on its own. The AI isn’t imposing an organization; it’s surfacing one that was already there.

That said, it also revealed how messy knowledge organization actually is. Authors use different terms for the same idea. They revisit concepts in new contexts where the meaning has shifted. They draw connections that are implicit rather than stated. Handling all of that gracefully is what makes indexing genuinely hard, for humans and AI alike.

What surprised you most when you first started training AI to recognize index-worthy concepts?

The biggest surprise was how bad the naive approach is, and how good a purpose-built approach can be. If you take a book and ask ChatGPT or Claude to “create an index,” the result is — to put it charitably — not usable. The AI will hallucinate page numbers, miss major concepts, invent entries for things that aren’t in the book, and produce something that looks like an index but doesn’t function as one. The American Society for Indexing tested this approach and found that generic LLMs produce only 20-40% of the access points a professional indexer would include.

But that’s because prompting a general-purpose AI to “make an index” is like asking someone to paint a house by handing them a bucket of paint and no brushes. The tool is powerful, but the approach is wrong. What works is breaking the problem into dozens of specialized steps: extracting terms page by page with full context, generating summaries at multiple levels, running deduplication passes, building relationship maps, and verifying every page reference against the actual text. Our pipeline has 27 distinct phases and over 42,000 words of custom prompts to guide those stages.
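Two of the steps named above, deduplication and page-reference verification, are simple enough to sketch. The following is an illustration under stated assumptions, not Indexia's pipeline: the function names are hypothetical, duplicates are merged only when they differ by case or whitespace, and a page reference counts as verified when the term literally appears on that page.

```python
import re


def normalize(text: str) -> str:
    """Collapse case and whitespace so trivially different spellings match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def verify_page_refs(pages, candidates):
    """Drop page references that the text does not support.

    `pages` maps page number to page text; `candidates` maps a proposed
    term to the page numbers claimed for it. Terms that differ only by
    case or spacing are merged, and terms with no surviving pages vanish.
    """
    verified = {}
    for term, page_numbers in candidates.items():
        key = normalize(term)
        kept = {p for p in page_numbers if key in normalize(pages.get(p, ""))}
        if kept:
            verified.setdefault(key, set()).update(kept)
    return {term: sorted(numbers) for term, numbers in verified.items()}
```

A check like this is what removes hallucinated locators: a claimed page survives only if the manuscript itself backs it up. Real text would also need handling for plurals, synonyms, and concepts that are discussed without being named.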

The gap between the naive approach and the engineered approach is enormous, and that gap is really the core of Indexia’s value.

How does Indexia deal with hallucination when creating an index?

This is one of the most important design decisions we made. Every term Indexia extracts is grounded in the source text. Every term in the index is linked back to the specific passages where that concept actually appears, and the user can see these excerpts directly. So rather than the AI generating index terms from its own “knowledge,” it’s pointing to real content in your manuscript.

Concretely, when Indexia identifies a term, it also captures the surrounding context. You can click on any index entry and see exactly where in the book it comes from, which makes it easy to verify, refine, or remove entries that don’t belong.
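As a rough illustration of what grounding an entry in the source text can mean, the sketch below collects every occurrence of a term together with a little surrounding context. The names `Excerpt` and `find_excerpts` and the `window` parameter are assumptions made for this example, not part of Indexia.

```python
from dataclasses import dataclass


@dataclass
class Excerpt:
    page: int
    text: str


def find_excerpts(pages, term, window=30):
    """Return each occurrence of `term` with `window` characters of context.

    `pages` maps page number to page text. Matching is case-insensitive and
    assumes lowercasing preserves string length, as it does for typical
    English text.
    """
    needle = term.lower()
    excerpts = []
    for page_number in sorted(pages):
        text = pages[page_number]
        lowered = text.lower()
        start = lowered.find(needle)
        while start != -1:
            begin = max(0, start - window)
            end = min(len(text), start + len(needle) + window)
            excerpts.append(Excerpt(page_number, text[begin:end]))
            start = lowered.find(needle, start + len(needle))
    return excerpts
```

An entry with an empty excerpt list has nothing in the manuscript to support it, which makes it an obvious candidate for removal.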

We also have what we call an Editorial Agent, which is an AI reviewer grounded in the Chicago Manual of Style that audits the index for quality, flags potential issues, and provides confidence scores for its suggestions. Anything below a certain confidence threshold gets deferred for human review rather than being applied automatically. The philosophy is: be comprehensive in what you surface, but transparent about your certainty.
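The threshold idea can be sketched in a few lines. This is a guess at the general shape rather than the Editorial Agent itself: the `Suggestion` class, the `triage` function, and the 0.8 cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    description: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def triage(suggestions, threshold=0.8):
    """Split suggestions into those applied automatically and those deferred.

    The 0.8 default is an arbitrary value chosen for illustration.
    """
    auto_apply = [s for s in suggestions if s.confidence >= threshold]
    needs_review = [s for s in suggestions if s.confidence < threshold]
    return auto_apply, needs_review
```

Nothing is discarded here: low-confidence suggestions are still surfaced, just routed to a person instead of being applied silently.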

For authors who are writing a nonfiction book now, when should they start thinking about the index?

Traditionally, the advice has been to wait until final page proofs, because page numbers can shift during editing and layout. That’s still true. You need final pagination before you can produce a definitive index.

But we’d actually encourage authors to think about their index much earlier, even if they’re not building it yet. As you write, pay attention to your key concepts, the terminology you use for recurring ideas, and how your argument is structured. If you find yourself using three different phrases for the same concept across chapters, that’s worth noting.

In fact, for my next book, I might use indexes to evaluate interim drafts. Since Indexia is so much more affordable and takes minutes, it becomes practical to generate a draft index at any stage just to see how your ideas are mapping. I don’t think authors are doing this very much now, but I think it could be very helpful as a writing tool as well.

For the actual production index, we’d say: wait until your text is final and you know your publisher’s guidelines, then give yourself at least a few days to review and refine the AI’s output before your deadline.

What excites you most about the future of AI-assisted publishing tools?

What excites us most is the democratization of quality. Right now, the full publishing toolkit — professional editing, design, indexing, marketing — is really only accessible to authors backed by well-resourced publishers. An independent scholar or a first-time author often has to choose between quality and affordability.

AI tools are starting to change that equation by making baseline quality accessible to everyone. An author who can’t afford a $2,000 index can still have a well-structured, standards-compliant one. A small publisher producing 50 titles a year can offer the same production quality as the Big Five. There is no longer any reason for a book to be published without an index.

We’re also excited about the possibilities beyond traditional books. As more knowledge lives in digital formats — blogs, research repositories, legal databases, journalism archives — the need for structured navigation tools is only growing. Indexes are one of the oldest knowledge technologies we have, and they’re surprisingly well-suited to the modern information landscape.

If you could magically improve one part of the publishing process with technology, what would it be?

We hope that technology lowers the barriers to entry for new authors. Self-publishing tools have already improved drastically over the past several years, and, often, self-publishing authors actually make more money than those publishing with large publishing houses. If technology can continue to make the production process easier, then the market for books could become even richer and less influenced by the incentives of large publishing houses.

For bloggers like myself, how do you see Indexia being used?

That’s a great question, and it’s actually a use case we’re excited about. If you’re a blogger with years of content — hundreds of posts across dozens of topics — you’re sitting on a body of work that’s essentially a book without an index. We built a special Substack tool on Indexia that is able to import all the articles from an account and index them, linking back to their source. Most bloggers probably don’t know this is a possibility, but we hope to see more blog-indexes in the future.

We’ve also seen interest from newsletter creators, journalists with long-running beats, and researchers who want to index their own collections of notes. Anywhere there’s a substantial body of text and a reader who wants to find their way through it, there’s a case for an index.

What’s next for Indexia?

We have a few things we’re really focused on right now. First, we’re continuing to improve the quality of our AI pipeline. We recently launched our Editorial Agent, which is an AI reviewer trained on the Chicago Manual of Style that audits indexes for consistency, catches duplicate entries, builds cross-references, and flags anything it’s uncertain about for human review. That’s been a big step toward professional-grade output.

Second, we’re working with academic publishers and university presses to explore how Indexia can fit into institutional publishing workflows. A lot of these organizations produce hundreds of titles a year and spend significant resources on indexing. We think there’s a compelling case for AI-assisted indexing at that scale.

Third, we’re working on building indexing tools beyond subject indexes. We recently launched Scripture Indexes and Legal Citation indexes. Legal is especially exciting, since indexing could be used to organize evidence for legal trials. We are working on a companion product called exhibitx.ai, which is specifically designed to help lawyers and parties index evidence to prepare for trial. The possibilities are endless.

Is there anything else you would like to add?

Just a thank you to everyone who’s been using Indexia and giving us feedback. We’re especially grateful for the authors who took a chance on us early and helped us improve the product. Building something in a space with this much tradition and expertise is humbling, and we take the craft of indexing seriously, even as we try to make it more accessible.

If any of your readers are working on a nonfiction book, a blog archive, a dissertation, or any substantial body of writing, we’d love for them to try Indexia at indexia.tech. And we’re always reachable if anyone has questions or feedback.

Thanks for such thoughtful questions, Kriti. This was a lot of fun.

One of the things I enjoyed most about this conversation was discovering how much an index resembles a piece of writing in its own right. Reading Will and Ben’s answers reminded me that indexes also reflect interpretation. Someone has to decide which ideas matter, which concepts belong together, and how a reader might move through a book’s landscape of ideas. In that sense, an index is not unlike an outline, a map, or even a guidebook.

I realize now that useful indexes do more than just locate topics by page number; they also show the relationships between ideas. Maybe that is why I keep returning to the image of a map. A book contains knowledge, stories, and arguments. An index helps readers find their way through them.

Many years ago, I started creating a book review index for the blog. With life changes, updating it has fallen by the wayside. Maybe one day, I will be able to use a tool like Indexia to create something that organizes my reviews and interviews by genre automatically.

What role do indexes play in your reading life? Is there a book whose index—or structure—has stayed with you? I would love to hear about it in the comments.

Thank you so much for joining us today! 🙂
