Five AI tools I pay for every month — and two free ones I can’t get used to
What $2,300 a year buys a working researcher — and what it doesn't.
The manifesto went up a few days ago. Within a few hours the first email arrived: “So which tools are you actually using?” Fair question. People talk about AI fatigue, and I understand it. Every week there is a new product that promises to compress a year of work into an afternoon. Most of them I look at for ten minutes and forget.
But five sit on my credit card every month. Three I open daily and think about. A fourth sits so deep in my workflow that I almost forgot it while drafting the math at the top of this post, which is itself an interesting fact about how that tool has worked out. A fifth I pay for and barely open. Two others sit on my desktop, free, and I keep walking past those as well. I have not cancelled anything; I just have not gotten anywhere with them. Below is what I pay for, what it costs, and what comes back per dollar spent. This is an honest account from an amateur's daily use, not a vendor review.
The math first
Five subscriptions, what I actually pay:
Claude Max: $100 per month, $1,200 per year.
Elicit Pro: $49 per month, $588 per year.
Consensus Pro (annual): $107 per year, about $9 per month equivalent.
Perplexity Pro (annual): $200 per year, about $17 per month equivalent.
ChatGPT Plus: $17 per month, $204 per year (legacy price — I have had this since early on and suspect the current rate is higher).
Total: around $2,300 per year. That is roughly the price of one trip to a small European conference, plus a decent dinner with the speakers. The question is what each one earns back.
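For anyone who wants to check the sums, here is a minimal sketch in Python. The prices are the ones listed above; the variable and function names are mine, chosen for illustration only. Monthly plans are multiplied by twelve, annual plans are taken as billed.

```python
# Annual cost per subscription in USD, using the figures listed above.
# Names (MONTHLY, ANNUAL, annual_costs) are illustrative, not from any tool.
MONTHLY = {"Claude Max": 100, "Elicit Pro": 49, "ChatGPT Plus": 17}
ANNUAL = {"Consensus Pro": 107, "Perplexity Pro": 200}


def annual_costs(monthly, annual):
    """Return one dict of yearly costs: monthly plans times 12, annual plans as billed."""
    costs = {name: price * 12 for name, price in monthly.items()}
    costs.update(annual)
    return costs


costs = annual_costs(MONTHLY, ANNUAL)
total = sum(costs.values())

# Print the most expensive subscription first, with its monthly equivalent.
for name, cost in sorted(costs.items(), key=lambda item: -item[1]):
    print(f"{name:<15} ${cost:>5,} per year  (~${cost / 12:.0f} per month)")
print(f"{'Total':<15} ${total:>5,} per year")
```

The exact sum is $2,299, which is where the "around $2,300" comes from.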
1. Claude Max + Cowork: the workhorse
Claude is the one I would keep if I could only keep one. I run the desktop app in Cowork mode, which matters more than the price tag suggests. Cowork lets the model work directly in a folder on my computer: read files, write files, edit files. It is not a chat window shouting answers at me. It is a colleague that opens the cabinet and rearranges the shelf.
What it does, day to day, is maintain what I call my Obsidian Robot (thanks to David Sparks, “MacSparky”). A vault of around a thousand files that thinks and writes the way I do: identity files, role files, writing-style rules, folder hygiene, skill definitions, transcripts of past meetings, draft chapters, reading lists. Every morning Claude reads through the relevant parts and produces a daily briefing: what is on the calendar, which manuscripts are mid-edit, which PhDs are waiting on me, which deadlines are about to land. It pulls from the actual files, not from a generic prompt.
After that, it helps with whatever sits in front of me. Some concrete examples from the past two weeks:
A 90-minute Teams meeting on a clinical workflow protocol. The transcript went in (raw, with the usual misheard names), and a clean meeting report came out: decisions, action items, open questions, with each colleague’s name corrected automatically because the vault knows the transcript tends to mangle them.
A manuscript a senior PhD student is finishing. Track-changes edits in the shape of my voice rather than a generic editor’s. The student then accepts or rejects each one; my job is to mark what is unclear, not to rewrite their thinking.
A letter of support for one of my African PhDs applying for a fellowship. Three drafts before the tone was right, all of them shaped by what the vault already knows about the candidate.
A short email to my assistant, asking her to find a slot for a thesis defence review. Claude drafts it in my voice and waits for my approval. I have a hard rule that nothing leaves on my behalf without that approval, and Cowork respects it.
This very post, including the cleanup of my vault that happened yesterday when I complained it had become messy. Half a thousand files in the wrong folders, an outdated identity document, an empty home page. Eighty minutes later all of it was resolved. I would have spent a Sunday on it manually, if I had spent the Sunday at all.
The limit I run into is context length on heavy days. Long PDF in, summary out, draft on top — three steps that sometimes hit the wall, and I split the work. The Max plan eases the wall, it does not remove it. I would also like better handling of slide images; for now image work goes through other channels.
If you write or read or organize for a living, this one carries the most weight.
2. Elicit Pro: for structured literature work
I use Elicit Pro for one job: structured literature work. A recent example is a background literature update for a manuscript I am co-supervising on hyperthermia and cytokine response. Background sections in this field age fast: what looked like the picture two years ago has gained three new safety datasets and lost one assumption. A current map is needed before the introduction can be written honestly.
The old workflow was three days in PubMed with a spreadsheet, a colleague to cross-check the search string, and a stack of printed abstracts on the corner of the desk. With Elicit Pro it goes like this. I define the question: what is the published evidence on cytokine response under controlled hyperthermia in solid tumors, with attention to temperature, duration, and tumor type? I run a report. Back comes a structured table of papers with the columns I asked for: sample size, intervention parameters, primary outcome, effect size, study design, country. I read the originals. Elicit does not replace that reading; it replaces the search-and-sort step that used to eat the first two days of every review. Financial-interest disclosure at the bottom of this post; read it.
A second use case from last month. One of my PhD students in South Africa is writing a systematic review on cervical-cancer-screening uptake in low-resource settings. The old pattern would have been: she sends drafts, I mark up the methodology, we iterate over weeks. Instead we did the first pass together in Elicit, side by side. She defined the question; I pointed out which columns she had forgotten and why. The point was less the speed and more the supervision. She now sees how I think about inclusion criteria, not just what I eventually conclude. That is closer to how I want to mentor: visible reasoning, not just verdicts.
A third, smaller use case. Before a thesis-committee meeting for one of my Belgian PhDs working on hyperthermia and the immune system, I run a quick scoping report in Elicit on whatever the student frames as their original contribution. Not to catch them out, but to know what the panel is likely to ask. It saves me an evening of reading and makes my questions sharper.
Pro at $49 per month gives me twelve reports per month, more than I use. The lower plan caps at a level I outgrow within a week of any active project. Honest: if you run one systematic review every two years, you do not need Pro. If you are running a trial, supervising PhDs running them, or preparing a grant, you do.
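Whether Pro is worth it depends on how many of those twelve reports you actually run. A rough sketch of the arithmetic, assuming the $49 price and the twelve-report allowance stated above; the usage levels in the loop are hypothetical, not my own numbers.

```python
PRO_PRICE_USD = 49      # monthly price stated above
PRO_REPORT_CAP = 12     # reports per month stated above


def cost_per_report(monthly_price, reports_run):
    """Effective cost of one report in a month where `reports_run` reports were used."""
    if reports_run <= 0:
        raise ValueError("reports_run must be at least 1")
    return monthly_price / reports_run


# Hypothetical usage levels: light, moderate, and the full allowance.
for reports in (3, 6, PRO_REPORT_CAP):
    per_report = cost_per_report(PRO_PRICE_USD, reports)
    print(f"{reports:>2} reports per month: ${per_report:.2f} per report")
```

At the full allowance a report costs about four dollars; at lighter use the per-report price climbs quickly, which is the reason for the caveat about occasional reviewers.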
What I do not trust Elicit for is anything outside its literature index. It is a research-assistant on papers, not a general reasoner. I keep separate tabs open.
3. Consensus Pro: the in-meeting answer
Consensus is the cheapest of the five and the one I open most often when I need an academic answer, because the questions I take to it are small and frequent. A single point. Usually inside a meeting, sometimes mid-paragraph while editing, sometimes between calls when a colleague’s question is still ringing in my head.
Three real questions from the past month, with what I did with the answer:
A reading-group discussion on a recent paper. The question that came up: has IL-15 upregulation under hyperthermia been demonstrated in solid tumors specifically, as opposed to hematological ones where most of the early work sits? Five minutes with Consensus gave me four relevant papers, ranked, with one-line summaries of where the evidence sits. I scanned the abstracts during the meeting and could contribute without bluffing. The conversation moved on with a sharper basis.
A draft of a slide I was preparing for a regional oncology meeting. I had written that whole-body hyperthermia “stimulates the immune system,” a phrase too loose for anyone who actually reads the literature, and a sharp questioner from the audience will catch it. Consensus brought me four mechanistic papers that justify a tighter version of the claim, the one I am willing to defend. The slide was rewritten the same evening before the deck went out.
A reviewer comment on a manuscript: “your introduction overstates the prevalence of high-risk HPV in sub-Saharan settings.” Was the reviewer right? Forty-five seconds in Consensus, three papers, and the answer was yes: only partly, but yes. The introduction was rewritten, the reviewer was thanked in the response letter, and the paper was stronger for it.
What Consensus is not, is a synthesis tool. It points at the papers; it does not draw the picture. For anything that needs a defensible conclusion I still go to the originals. The value is in skipping the twenty minutes I used to spend setting up a PubMed search to land on the one paper I actually needed.
Pro on the annual plan is $107 for the year, about $9 per month equivalent. A free tier with twenty searches per month exists and is usable if you only reach for it occasionally. I take the annual plan because I use it many times more often than that.
4. Perplexity Pro: the one that has become invisible
Honest correction. While drafting the math at the top of this post, my first list had three subscriptions. There is a fourth, and I almost missed it because it has dissolved into how I work. Perplexity Pro, $200 per year on the annual plan, about $17 per month equivalent. It has completely replaced Google search for me. I use it more often than Consensus, more often than anything else on the list, multiple times a day.
The simple test: I cannot remember the last time I typed into google.com. Months at least. When I want a fact, a definition, a current price, the name of a tool somebody mentioned in a meeting, a news headline, a quick directional read on a regulatory body, an opening hour, an address, the name of a piece of music I half-recognize, anything that the old me would have googled, I now open Perplexity. It gives me a short, sourced answer with the links underneath, in the time Google takes to load its ad-heavy result page.
Three uses from yesterday and today, of the kind I would not normally write down because they are too ordinary:
A clinical-trials.gov detail I needed for a question about competing trials in a tumor type adjacent to my own.
The Brussels address of an organization whose name I had only half-remembered from a recent meeting.
A check on whether a specific journal sits on the predatory list a colleague had sent me by email, before I told a student to consider submitting there.
For deep literature work I still go to Consensus or Elicit, where the index is curated to peer-reviewed sources and I can rely on the ranking. One caveat, and it matters: search the same question twice with differently framed prompts — one sceptical, one credulous — and Perplexity will tend to confirm whichever framing you bring. I have noticed this when investigating a regulatory question from two angles; the second search was noticeably more supportive of the second framing. For anything with real consequences, verify with Elicit or go to the original literature. For everything else, Perplexity. It is the most invisible of the five because it does not feel like an AI tool. It feels like a fixed search bar that gives better answers than the one I used for twenty years.
If you have not tried it and you have read this far, take twenty dollars and run a one-month experiment. My prediction is that you will not go back. This is the strongest recommendation in this whole post, even though I almost forgot to make it.
The three I keep walking past
Google Gemini. I tried it through Gmail, where Google pushes it hard. The integration looks promising on paper: read the inbox, summarize threads, draft replies. In practice the answers were often wrong or beside the point. A meeting request from a CRO project manager was filed under “newsletter”. A thread-summary about a regulatory submission missed the actual ask, which sat in the third paragraph of a long mail. A draft reply to an African collaborator opened with a tone I would never use, friendly to the point of slick. After a few rounds I stopped opening it. It cost me more in re-checking than it saved. It may improve, and I will try again. For now, no.
NotebookLM. This is the one I want to like. The idea fits how I work: upload your sources, ask questions inside the boundary of those sources. The interface, though, never clicks for me. The few times I sat down to use it (uploading a stack of hyperthermia papers, trying to ask a comparative question across them) the answers were fine, but the friction was real. And I noticed I could do the same thing in Claude Cowork with my Obsidian vault as the source, plus everything else Cowork lets me do, including writing back, editing, and running small scripts. The marginal value of opening a second tool was not there. I keep it installed, in case the interface improves, but my hand never reaches for it.
ChatGPT. I use it rarely, and I cannot fully explain why. I do pay for it — ChatGPT Plus, $17 per month at a legacy price I have had since early on. It is a capable model, competitive in most benchmarks I have seen, and it was the first AI I ever heard of, which at least means I have no excuse of ignorance.
Part of the explanation is probably habit: I arrived at Claude before ChatGPT had time to form one. Part of it may be that the Cowork integration fits my workflow tightly enough that switching to another model for the same tasks offers no clear gain.
What has changed recently is Codex, OpenAI’s coding agent. I do not know yet whether it shifts my behaviour. A proper comparison — Codex versus Claude Cowork, running the same tasks I actually use — is something I intend to write about. Until then, ChatGPT sits installed and rarely opened.
None of the three is cancelled. Gemini and NotebookLM cost nothing. ChatGPT costs $17 per month and I keep not cancelling it. They all sit unused, which is its own answer.
The rule I follow
After three years of trying every tool a colleague mentions, one rule has held: a tool earns its place only if it fits a workflow I already have. The question is not “which AI is the best.” It is “which AI lets me stay in my workflow instead of forcing me to rebuild around it.”
Claude with Cowork sits inside the Obsidian vault I already use. Elicit sits inside the systematic-review process I already run. Consensus sits inside the in-meeting-question flow that already happens to me twenty times a week. Gemini wanted me to live in Gmail. NotebookLM wanted me to live in NotebookLM. Neither is a workflow I want.
If a tool requires me to change how I work to get its benefit, the bar is much higher. I keep paying for the ones that fit.
Where I am still uncertain
A few things I have not solved and would welcome notes on:
Reference manager + AI. I use Zotero. Nothing I have tried yet integrates cleanly without a clunky export-import step. If you have a setup that works, tell me.
Bandwidth-light AI for PhDs in Kisumu or Addis. Most of my students work over connections where a 200 MB model download is a problem. Almost everything I use assumes broadband.
AI for slide review. My core craft is pathology slides. The current image-analysis tools are not at the level I would trust for a clinical decision, and I have stopped wasting time on them. If that changes, I will write about it.
Replies welcome in the comments. I read them.
Disclosures. I pay for Claude Max ($100/month), Elicit Pro ($49/month), Consensus Pro on the annual plan ($107/year), Perplexity Pro on the annual plan ($200/year), and ChatGPT Plus ($17/month, legacy price). I have no commercial relationship with any of these vendors and have not received complimentary access from them. Separately, I am CEO and shareholder of ElmediX, a Belgian MedTech company developing whole-body hyperthermia, which sets the field context for several examples in this post. None of this affects my views on the AI tools, but you deserve to know it.


