6 lessons from leveraging data & AI as a micro VC fund with $50M AUM
by Alex Patow, Data Analytics Engineer at Inflection | Originally published on Inflection.

Data-Driven Investing as a Micro Fund
At Inflection, we firmly believe that the future of VC is “data-driven”, especially for funds our size. We also equally value the human element of venture capital. There's no substitute for deep connections with founders, other VCs, and our own intuition. We believe this combination of data and human insight is where outsized returns become possible.
Being data-driven, for us, means building tools that augment the entire team's work, not just that of investors. We focus on enhancing all aspects of our operations, not only sourcing investments. As the sole "data person" in our six-member team, my role involves wearing many hats, but ultimately, my mission is to scale the efficiency and impact of our team through strategic use of data and software.
Second Wave of Data-Driven VC
We’re now entering our second wave of “data-driven VC”. Early pioneers in this space (such as EQT, Earlybird, Moonfire, amongst others) had to build substantial infrastructure from scratch:
Complex pipelines to wrangle spotty and immature data sources
Training and hosting custom models
Creating platforms due to a lack of VC-specific tools (particularly CRM systems)
While there's still value in prioritizing in-house development, it requires a large team to implement effectively (something that’s not realistic for funds of our size).
Fortunately, the industry has evolved significantly since the first wave:
Proliferation of (relatively) stable, cheap, and reliable Large Language Models (LLMs)
Decreasing cost and increasing availability of out-of-the-box data and tools, specifically targeted at our industry (we’re big fans of Specter, Gravity, People Data Labs, and Attio for our needs)
Increasing engineering efficiency through AI-assisted coding, data transformation, and cloud deployment tools
We expect these trends to accelerate, with advancing technologies enabling even smaller funds to compete effectively, shifting the key differentiator from data infrastructure to insight interpretation and action.
To funds looking to become data-driven: the time is now.
Our First Year
Inflection’s initial blog post on “An Engineering Approach to Venture Capital” was published just over a year ago. I started with the firm in November of last year.
Since then, we’ve built some really cool stuff, and we’ve also done some really “boring” work:
Launched our internal deal sourcing tool, Pathfinder
Utilizes internal and external data, LLM agents, and a graph database for sourcing opportunities in our sector
Automatically populates our CRM with sourced opportunities for the investment team's review
It’s currently evaluating roughly 500 new companies and 4,000 founders per week, a task that would be impossible for us without tooling
Brought in best-in-class tools for deal tracking and portfolio hiring
Reduced the time for creating valuation memos for our audit process by 80%
Achieved through LLM Agents and Retrieval-Augmented Generation (RAG)
Acquired signal and people data sources to support our projects
Refreshed our website
Streamlined our toolset by eliminating unused tools
Like I said, not everything is glamorous, but it’s important for being a holistic, data-driven fund.
Lessons Learned
Although we are early in our journey, we have learned some lessons along the way:
#1 Start Small, as Small Projects Snowball
While building our data-driven sourcing tool (Pathfinder), we were asked by the operations team to help automate the creation of this year's valuation memos. These memos summarize the investments we’ve made, their current valuations, and the rationale behind those valuations for the audit process. As there’s one memo for each investment, writing them manually is typically time-consuming. By doing a small, one-week project, we were able to validate Pathfinder's tool stack (specifically LLM Agents) in a controlled environment.
These learnings significantly accelerated our progress on sourcing. Importantly, it also helped achieve buy-in from our operations team on the vision of a data-driven fund. We've found that starting with small, manageable projects can lead to unexpected benefits and growth for longer-term strategic projects.
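To make the retrieval half of a memo workflow like this more concrete, here is a minimal sketch in Python. It is not Inflection's actual implementation: the note data, the function names (`tokenize`, `retrieve`, `build_memo_prompt`), and the query wording are all hypothetical, and simple keyword overlap stands in for the embedding search a real RAG setup would more likely use. The prompt it returns would then be handed to whichever LLM the team prefers.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, notes, top_k=2):
    """Rank notes by keyword overlap with the query and keep the best matches.

    Keyword overlap is an assumption made for brevity; it stands in for
    embedding-based similarity search.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(note["text"])), note) for note in notes]
    # Drop notes with no overlap, then sort by score (highest first).
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in relevant[:top_k]]


def build_memo_prompt(company, notes, top_k=2):
    """Assemble an LLM prompt that is grounded in the retrieved notes."""
    query = f"{company} valuation round price rationale"  # hypothetical query wording
    sources = retrieve(query, notes, top_k)
    context = "\n".join(f"[{n['id']}] {n['text']}" for n in sources)
    return (
        f"Draft a valuation memo for {company} using only the sources below.\n"
        f"Cite source ids in brackets.\n\nSources:\n{context}"
    )


# Hypothetical notes, invented purely for this example.
EXAMPLE_NOTES = [
    {"id": "note-1", "text": "Acme Robotics closed a priced round in March; "
                             "the valuation is marked at the round price."},
    {"id": "note-2", "text": "Team offsite agenda and travel logistics."},
    {"id": "note-3", "text": "Acme Robotics revenue update supporting the "
                             "current valuation rationale."},
]

if __name__ == "__main__":
    print(build_memo_prompt("Acme Robotics", EXAMPLE_NOTES))
```

Because there is one memo per investment, the same function can simply be called once per portfolio company, which is what makes a small project like this a cheap testbed for a larger agent stack.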
#2 Take Calculated Risks
The first iteration of Pathfinder relied on a linear process where LLM agents gathered the information needed to evaluate potential investment opportunities as they arrived in the system. While functional, this approach had limitations in providing context and integrating existing results, often leading to confusing outcomes. Recognizing these limitations, we decided to take a calculated risk by rethinking our data architecture and transitioning to a graph database.
We believed that modeling our database to mirror the entities and relationships investors use in assessing companies would lead to better results from our LLM agents. This required substantial rework, but the results have been significantly more aligned with our investment thesis and early-stage opportunities. Adopting a GraphRAG approach has enhanced the agents' contextual understanding (see Neo4j's writeup on GraphRAG, and stay tuned for more on our blog). This decision exemplifies how embracing calculated risks, when guided by clear objectives, can drive innovation and unlock new value.
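The idea of mirroring investor entities and relationships in a graph, then feeding the relevant neighborhood to an agent, can be sketched roughly as follows. This is an illustration under stated assumptions, not Pathfinder's schema or code: the node labels, relation names, example records, and the `Graph`, `build_example_graph`, and `build_prompt` names are all hypothetical, and a small in-memory structure stands in for a real graph database.

```python
class Graph:
    """Tiny in-memory graph: named nodes plus (source, relation, target) triples."""

    def __init__(self):
        self.nodes = {}    # node id -> {"label": ..., "name": ...}
        self.triples = []  # list of (source id, relation, target id)

    def add_node(self, node_id, label, name):
        self.nodes[node_id] = {"label": label, "name": name}

    def add_edge(self, source, relation, target):
        self.triples.append((source, relation, target))

    def context_for(self, node_id, max_hops=2):
        """Collect one plain-text fact per edge within max_hops of node_id."""
        seen = {node_id}
        frontier = {node_id}
        used = set()  # indexes of triples already turned into facts
        facts = []
        for _ in range(max_hops):
            next_frontier = set()
            for index, (s, r, t) in enumerate(self.triples):
                if index in used or (s not in frontier and t not in frontier):
                    continue
                used.add(index)
                facts.append(f"{self.nodes[s]['name']} {r} {self.nodes[t]['name']}")
                next_frontier.update({s, t} - seen)
            seen |= next_frontier
            frontier = next_frontier
        return facts


def build_example_graph():
    """Hypothetical records, invented purely for this example."""
    g = Graph()
    g.add_node("c1", "Company", "Acme Robotics")
    g.add_node("c0", "Company", "PriorCo")
    g.add_node("f1", "Founder", "Jane Doe")
    g.add_node("i1", "Investor", "Example Ventures")
    g.add_node("s1", "Sector", "Robotics")
    g.add_edge("f1", "FOUNDED", "c1")
    g.add_edge("f1", "PREVIOUSLY_WORKED_AT", "c0")
    g.add_edge("i1", "INVESTED_IN", "c0")
    g.add_edge("c1", "OPERATES_IN", "s1")
    return g


def build_prompt(graph, node_id, question, max_hops=2):
    """Prefix the question with graph facts so the agent sees related context."""
    facts = "\n".join(f"- {fact}" for fact in graph.context_for(node_id, max_hops))
    return f"Known facts:\n{facts}\n\nQuestion: {question}"


if __name__ == "__main__":
    graph = build_example_graph()
    print(build_prompt(graph, "c1", "Does this company fit our thesis?"))
```

Compared with a linear pipeline that looks at each company in isolation, the agent here also sees what is already known about the founder's history, and widening `max_hops` pulls in more distant relationships at the cost of a longer prompt.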
#3 Embrace Nimbleness
Some tools we build may only provide 3-6 months of “alpha” before external tools emerge and commoditize that part of our platform. When more mature, off-the-shelf solutions become available, we transition from in-house development to leveraging these external solutions. This approach allows us to redirect our focus and resources toward identifying and developing the next source of alpha.
#4 Secure Buy-in from the Team
Team support is crucial when implementing data-driven initiatives at a VC fund. Achieving reliable, scalable, and impactful results takes time, and not all efforts will succeed. Organizational buy-in enables experimentation and grants the independence needed for effective decision-making.
At Inflection, the team had already explored various Python scripts and external tools to enhance workflows, from podcast transcription to research summarization and web scraping. Recognizing the potential leverage of a technical role in the post-LLM era, the investment team outlined initial ideas on their blog.
To retain this alignment and maintain momentum, we implemented a structured approach:
A week-long kick-off session to set principles and direction
Weekly product calls and bi-monthly in-person work days
Bi-annual presentations on innovations and roadmaps
This structure fosters tight feedback loops and a culture of innovation.
#5 Find a Product-Engineering Hybrid
Look for someone who enjoys both product and engineering. In smaller funds, this role often falls to one person. Much of the job involves managing product, not just engineering in isolation. It's essential to have someone who can excel (and enjoys working!) in both areas to ensure what's being built matches the fund's needs. This should be a full-time role for someone passionate about building, not a part-time job for an investor.
#6 Develop Guiding Principles, Not a Backlog
Start by establishing guiding principles for the first 12 months, rather than specific projects. Let the projects emerge over time. For us, this meant creating an architectural philosophy that addressed questions like how to approach "buy vs. build", how to balance "speed of iteration vs. quality of output", and whether to focus on "platforms" or "projects". This approach gives us the comfort to get started without locking us into specific projects or implementation details, allowing for flexibility as we learn and grow.
What’s Next for Inflection
Continuously improving our sourcing tool, Pathfinder
Adding new data sources (research papers, social media, etc.) and giving our agents the ability to expand the graph through their “intuition”
Crafting a “magical” experience for the investment team to input and extract information from our knowledge graph (we love the work USV has done with their “Librarian” tool)
Generating novel ideas for companies and areas of exploration based on research and our thesis
Launching tools for our founders that go beyond just hiring talent
Actively growing the community of data-driven VCs by:
Open sourcing our work
Collaborating with our community and new data providers
Presenting at industry events
Hopefully this serves as a helpful guide for smaller funds looking to be “data-driven”. We believe that by embracing these practices, micro funds can punch above their weight and compete effectively in today's fast-paced and competitive venture landscape.
Please get in touch if you have any questions or would like to collaborate on this work!
