Open ideas. If you decide to build any of these, let me know!
- CLI tool that takes an image and sticks it on the web as a permalink. Currently I copy images into GitHub and steal the link that it generates.
- Reresume: upload a resume and a job description link and get a rewritten resume tailored to the job.
- Sweresume: automatic LaTeX resumes for software engineers following a standard template. 1
- Convert research papers to running code automatically, or at least a repo scaffold.
- Match post writers across the web using stylometry.
- Receipt extraction for splitting meals and groceries. Just take a picture of what you ate or purchased and have an LLM do the math. Could be possible when the GPT-V API drops. Fuyu-8B might work as well.
- Build a knowledge graph of the code blocks in an entire repository. There can be an agent that operates on this graph asynchronously. I believe Cursor is working on something like this.
- Interview prep using ChatGPT + voice mode.
- Write a blog post evangelizing VS Code.
- Embed your entire Twitter/Discord/etc and make connections from your likes, followers, etc.
- Display some of the best pieces of writing on the Internet in a common place, with beautiful styling. For example, all of the Paul Graham essays, some Twitter posts, and Ted Chiang pieces like Understand that only exist on old web archives.
- Dscan: scan through your data and “swipe left/right” on good/bad examples to manually create a training set. 1, 2
- Convert images to LaTeX, could be useful for converting complicated graphs or diagrams.
- Automatically teach yourself using short form videos like the Subway Surfers TikToks. Can query content from anywhere: Reddit, Twitter, etc. Need to make a more general data querying tool that I can use for things.
- Summarize any subreddit with LLMs by querying the RSS.
- Rebuild popular products like Postman as free, open-source tools.
- An AR tool that lets you query references you like to make. Imagining something like Family Guy cutaways, effectively querying your memories.
- A way to modify a calendar state by directly altering the JSON version with an LLM.
- Embed solutions to LeetCode problems to compare similarity.
- A language model to translate mixed languages like Chinglish or Hinglish.
- A/B testing for agentic web browsing.
- Automatically create YouTube shorts/TikToks from subreddits.
- Solve the issue where you have a ton of Jupyter notebooks all with slightly different variations of the same pre-processing code.
- A little guy who organizes and cleans up your codebase while you’re not using your computer.
- Link entire codebases together graphically and automatically and allow users to parse through the function calls in a spatial UI – “spatial coding”.
- Map coding error messages to their solutions to build up a repository of personal debugging tricks automatically.
  - A model could use this data for its own exploration when it's stuck, so it learns from past stack traces + Google search.
- GPT vision tools:
  - Better autonomous web browsing
  - Receipt extraction
  - Math homework helper
  - Image to LaTeX converter
  - Calorie counter by converting image to ingredients + use RAG to compare against nutrition facts
  - Recipe generator from ingredients
  - Diagram generator from whiteboard
  - Excel analyzer
  - Detect fake vs. real sneakers
  - Structure-anything: convert anything to structured data
  - An “anything API” since it can see the whole web
  - Something that replaces Selenium/Playwright for web scraping
  - Solve web accessibility
  - Analyze screenshots of audio waves
- Actually good dev-tools for things like base64 conversion, OpenAI token counting, etc. Good for LLM devs and regular devs alike.
- Better interface for constant web-scraping using this.
- Convert an MIT course into useful lecture notes. Should add a sliding scale that lets you modify the difficulty, can do this by pre-generating content for different education levels. 1
- Compressing thought and expanding it again is like noising and denoising in diffusion models.
- Continuously summarize HN, Reddit, etc. and generate podcast episodes, similar to ScribePod.
- Better way to map things you like in a city, Felt is doing a great job at this.
- Generate color palette suggestions from aesthetic images.
- Nice view components for Spotify and other apps to make it easy to drop on your website.
- Chrome extension for bionic reading.
- BeReal memories downloader.
- Notion page for all LeetCode problems with better solutions.
- Predict Stable Diffusion prompts from the images, can train on DiffusionDB.
- LeetCode but for debugging, currently no way to practice scenario-specific debugging. Solved: 1
- Stable Diffusion + reCAPTCHA.
- Twitter except it’s just posts your friends have liked.
- An easy way to analyze scrapes of your own liked tweets in embedding space, something better than grep.
- Website to help develop intuition on hard math/CS concepts, best quality explainers.
- Watermarking handwriting/typed text algorithmically, something like what Scott Aaronson worked on at OpenAI.
- Grammarly + GPT.
- Teach students things on TikTok by distilling MIT OCW into short form clips for coding and such.
- Hack viral short form marketing for any product launch.
- Train a model to chunk a long passage into human friendly sections. 1
- Convert git commit history to a changelog. It could be something you add to a repo (GitHub action?) that keeps the README updated for instance.
- Take any data and convert it into a knowledge graph, with a simple API that makes it useful for other projects.
- Build a realtime navigation tool for blind people into an iPhone app. 1
- Visualize information as embeddings, like taking a textbook and converting it into a cloud graph. 1, 2
- Segment a webpage → embed it → select pieces based on a nearest neighbor search of the embeddings/query. Requires contrastive data for a web snippet/text, like CLIP.
- Try using CogVLM to break CAPTCHAs. 1
- Build a text editor that periodically takes snapshots of the text area and sends them to GPT for autocorrect, an auto editing writing editor basically.
- Finetune an LLM on your own tweets.
- Rebuild this but with GPT-V.
- Route to any model, cloud-based or local, with ease.
- Visualize loss curves in 3D with some cool interface.
- So much to build with Anthropic MCP, Anthropic Computer Use, and Gemini Multimodal Live API.
- Generate UI with a few keystrokes.
- Customize podcasts for your ears – speaker diarization to speed up individual voices, edit pods at high level, easy to download to your device.
- What to do while waiting for reasoning LLMs to think? Potentially a huge advertising play here.
- LeetCode, except every problem is a real world scenario instead of a technical problem statement. The purpose is to understand where algorithms are applied. Could implement using a Chrome extension that calls out to an LLM to convert the problem statement into a real world scenario, and still run the code on LeetCode.
- Automatically run git bisect when debugging and tell an LLM what bug you’re looking for. Like you can ask it to find which commit broke a button or something, and it can go render the page at different states using bisect.
- Goodreads with minimal, modern UI.
- Chrome extension that lets you hide people on Twitter who are being annoying, using something funny like a CSGO AWP.
- Vibe check industries by taking the average sentiment of a subreddit (“how are the folks on r/CSMajors doing today?”).
- Track company job openings over time to see if they’re actually reducing hiring due to the advent of AGI.
- Is it AI? A game to determine if a poem is AI generated or not.
- Crowdsourced map of AI startup offices to see how the epicenter of AI changes over time.
- March Madness prompt battle. Select a model, write a prompt, and pay $5 to let your AI bracket compete against others.
- More vibe coding with Three.js. 1
- Stock analyst in your email. Give it a few tickers you care about and it’ll use Perplexity’s API for research + Claude for summaries.
- Text a number to create a Linear ticket (or maybe use iOS shortcuts).
- Let people vote on the quality of software tools: a realtime poll of the best languages, apps, etc. Login with GitHub. Tools like Flighty and Beli fly under the radar and should be more visible.
- Scrape SF housing data from Craigslist in realtime and filter for the best deals with LLMs.
- Auto-updating spreadsheet of all niche SaaS apps and stats.
- Automatically route to the best LLM for a given task based on current latency.
- Scrape 4chan for alpha from the LLM/CV communities.
- Run stylometry on popular Twitter accounts to figure out if people are ghostwriting for others.
- There may be something interesting to do with NotebookLM if they ever release an API.
- GitHub style commit graph for Congress. 1
- There are a million companies to start that use GPT-4o native image generation in some clever way.
- Cool ways to store data like chess games. 1
- Solve Dunnet with LLMs automatically.
- LLMs for analyzing logs and automatically finding errors.
- Guess how many likes a Tweet or TikTok has by reading the content alone, to tune your internal model.
- Cursor extension for adding meta comments that only the LLM can see.
- Read every word a person has ever written by having LLMs scrape the internet for everything.
- Unedited, real-time timelapses of people using AI, to understand how the most effective work is done.
- Post what you got done this week anonymously.
- Use Claude computer use or BrowserBase for headless automation.
- Use some of the new video models like Veo, HeyGen, Ideogram, Runway, etc.
- LaTeX codegen → Vision LM: let the LLM generate LaTeX, see the output and iterate.
- PDF to brainrot.
- Dwarkesh podcast generator for papers and articles, turn any text into a podcast (like NotebookLM).
- Use dithering for something cool. Visual Electric is also a really cool image generation tool.
- Unbrick the Car Thing from Spotify.
- Map OpenAI releases to employee GitHub activity.
- Track terrible journalist takes from a long time ago on a website. Maybe some way to rank and grade them?
- Pictionary except you have to guess the prompt used to generate the image, use something like FAL.
- Compute sentiment of quotes of a Tweet, figure out if people are mad at it.
- Let Claude control an iPhone with iPhone mirroring + computer use.
- Pictionary where you draw and the LLMs try to guess.
- A playground to quickly iterate on AI UX interfaces (I guess v0 is basically this).
- hnfast.com: get the high level updates.
- GazeLLE for % eye contact made in a podcast, figure out who can keep eye contact.
- Shortcut (maybe text) for creating calendar events from text.
- HUD for autocomplete while you’re in a meeting, so you can easily figure out what to say next.
- Automatically ingest the Internet on a user’s behalf and screen it for memetic viruses. “Internet condom”. 1
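
For anyone who wants to poke at the "segment a webpage → embed it → nearest neighbor search" idea above, here is a minimal sketch of the shape of that pipeline. Everything in it is an assumption for illustration: the function names (`segment`, `embed`, `cosine`, `top_snippets`) are made up, splitting on blank lines stands in for real page segmentation, and the bag-of-words vector stands in for a learned (ideally contrastively trained) embedding model.

```python
import math
import re
from collections import Counter


def segment(page_text: str) -> list[str]:
    """Split page text into snippets on blank lines (stand-in for real DOM segmentation)."""
    return [s.strip() for s in re.split(r"\n\s*\n", page_text) if s.strip()]


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real version would call a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_snippets(page_text: str, query: str, k: int = 1) -> list[str]:
    """Return the k snippets whose vectors are closest to the query vector."""
    q = embed(query)
    snippets = segment(page_text)
    return sorted(snippets, key=lambda s: cosine(embed(s), q), reverse=True)[:k]


page = """Pricing starts at ten dollars per month.

Our team is based in San Francisco.

Contact support by email for refunds."""

print(top_snippets(page, "how much does it cost per month"))
```

Swapping `embed` for a real model is the interesting part; the rest is just sorting by similarity.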