Open ideas. If you decide to build any of these, let me know!

  • CLI tool that takes an image and sticks it on the web as a permalink. Currently I copy images into GitHub and steal the link that it generates.
  • Reresume: upload a resume and a job description link and get a rewritten resume tailored to the job.
  • Sweresume: automatic LaTeX resumes for software engineers following a standard template. 1
  • Convert research papers to running code automatically, or at least a repo scaffold.
  • Match post writers across the web using stylometry.
  • Receipt extraction for splitting meals and groceries. Just take a picture of what you ate or purchased and have an LLM do the math. Could be possible when the GPT-V API drops. Fuyu-8B might work as well.
  • Build a knowledge graph of the code blocks in an entire repository. An agent could then operate on this graph asynchronously. I believe Cursor is working on something like this.
  • Interview prep using ChatGPT + voice mode.
  • Write a blog post evangelizing VS Code.
  • Embed your entire Twitter/Discord/etc. and make connections from your likes, followers, etc.
  • Display some of the best pieces of writing on the Internet in a common place, with beautiful styling. For example, all of the Paul Graham essays, some Twitter posts, and Ted Chiang essays like Understand that only exist on old web archives.
  • Dscan: scan through your data and “swipe left/right” on good/bad examples to manually create a training set. 1, 2
  • Convert images to LaTeX, could be useful for converting complicated graphs or diagrams.
  • Automatically teach yourself using short form videos like the subway surfer TikToks. Can query content from anywhere: Reddit, Twitter, etc. Need to make a more general data querying tool that I can use for things.
  • Summarize any subreddit with LLMs by querying the RSS.
  • Rebuild popular products like Postman as free, open-source tools.
  • An AR tool that lets you query references you like to make. Imagining something like Family Guy cutaways, effectively querying your memories.
  • A way to modify a calendar state by directly altering the JSON version with an LLM.
  • Embed solutions to LeetCode problems to compare similarity.
  • A language model to translate mixed languages like Chinglish or Hinglish.
  • A/B testing for agentic web browsing.
  • Automatically create YouTube shorts/TikToks from subreddits.
  • Solve the issue where you have a ton of Jupyter notebooks all with slightly different variations of the same pre-processing code.
  • A little guy who organizes and cleans up your codebase while you’re not using your computer.
  • Link entire codebases together graphically and automatically and allow users to parse through the function calls in a spatial UI – “spatial coding”.
  • Map coding error messages to their solutions to build up a repository of personal debugging tricks automatically.
    • A model could use this data for its own exploration when it's stuck, so it learns from past stack traces + Google search.
  • GPT vision tools:
    • Better autonomous web browsing
    • Receipt extraction
    • Math homework helper
    • Image to LaTeX converter
    • Calorie counter by converting image to ingredients + use RAG to compare against nutrition facts
    • Recipe generator from ingredients
    • Diagram generator from whiteboard
    • Excel analyzer
    • Detect fake vs. real sneakers
    • Structure-anything: convert anything to structured data
    • An “anything API” since it can see the whole web
    • Something that replaces Selenium/Playwright for web scraping
    • Solve web accessibility
    • Analyze screenshots of audio waves
  • Actually good dev-tools for things like base64 conversion, OpenAI token counting, etc. Good for LLM devs and regular devs alike.
  • Better interface for constant web-scraping using this.
  • Convert an MIT course into useful lecture notes. Should add a sliding scale that lets you modify the difficulty, can do this by pre-generating content for different education levels. 1
  • Compressing thought and expanding it again is like noising and denoising in diffusion models.
  • Continuously summarize HN, Reddit, etc. and generate podcast episodes, similar to ScribePod.
  • Better way to map things you like in a city. Felt is doing a great job at this.
  • Generate color palette suggestions from aesthetic images.
  • Nice view components for Spotify and other apps to make it easy to drop on your website.
  • Chrome extension for bionic reading.
  • BeReal memories downloader.
  • Notion page for all LeetCode problems with better solutions.
  • Predict Stable Diffusion prompts from the images; can train on DiffusionDB.
  • LeetCode but for debugging; there is currently no way to practice scenario-specific debugging. Solved: 1
  • Stable Diffusion + reCAPTCHA.
  • Twitter except it’s just posts your friends have liked.
  • An easy way to analyze scrapes of your own liked tweets in embedding space, something better than grep.
  • Website to help develop intuition on hard math/CS concepts, best quality explainers.
  • Watermarking handwriting/typed text algorithmically, something like what Scott Aaronson worked on at OpenAI.
  • Grammarly + GPT.
  • Teach students things on TikTok by distilling MIT OCW into short form clips for coding and such.
  • Hack viral short form marketing for any product launch.
  • Train a model to chunk a long passage into human friendly sections. 1
  • Convert git commit history to a changelog. It could be something you add to a repo (GitHub action?) that keeps the README updated for instance.
  • Take any data and convert it into a knowledge graph, with a simple API that makes it useful for other projects.
  • Build a realtime navigation tool for blind people into an iPhone app. 1
  • Visualize information as embeddings, like taking a textbook and converting it into a cloud graph. 1, 2
  • Segment a webpage → embed it → select pieces based on a nearest neighbor search of the embeddings/query. Requires contrastive data for a web snippet/text, like CLIP.
  • Try using CogVLM to break CAPTCHAs. 1
  • Build a text editor that periodically takes snapshots of the text area and sends them to GPT for autocorrect, an auto editing writing editor basically.
  • Finetune an LLM on your own tweets.
  • Rebuild this but with GPT-V.
  • Route to any model, cloud-based or local, with ease.
  • Visualize loss curves in 3D with some cool interface.
  • So much to build with Anthropic MCP, Anthropic Computer Use, and Gemini Multimodal Live API.
  • Generate UI with a few keystrokes.
  • Customize podcasts for your ears – speaker diarization to speed up individual voices, edit pods at high level, easy to download to your device.
  • What to do while waiting for reasoning LLMs to think? Potentially a huge advertising play here.
  • LeetCode, except every problem is a real world scenario instead of a technical problem statement. The purpose is to understand where algorithms are applied. Could implement using a Chrome extension that calls out to an LLM to convert the problem statement into a real world scenario, and still run the code on LeetCode.
  • Automatically run git bisect when debugging and tell an LLM what bug you’re looking for. Like you can ask it to find which commit broke a button or something, and it can go render the page at different states using bisect.
  • Goodreads with minimal, modern UI.
  • Chrome extension that lets you hide people on Twitter who are being annoying, using something funny like a CSGO AWP.
  • Vibe check industries by taking the average sentiment of a subreddit (“how are the folks on r/CSMajors doing today?”).
  • Track company job openings over time to see if they’re actually reducing hiring due to the advent of AGI.
  • Is it AI? A game to determine if a poem is AI generated or not.
  • Crowdsourced map of AI startup offices to see how the epicenter of AI changes over time.
  • March Madness prompt battle. Select a model, write a prompt, and pay $5 to let your AI bracket compete against others.
  • More vibe coding with ThreeJS. 1
  • Stock analyst in your email. Give it a few tickers you care about and it’ll use Perplexity’s API for research + Claude for summaries.
  • Text a number to create a Linear ticket (or maybe use iOS shortcuts).
  • Let people vote on quality of software tools - realtime poll of the best languages, apps, etc. Login with GitHub. Stuff like Flighty and Beli that flies under the radar, should make them more visible.
  • Scrape SF housing data from Craigslist in realtime and filter for the best deals with LLMs.
  • Auto-updating spreadsheet of all niche SaaS apps and stats.
  • Automatically route to the best LLM for a given task based on current latency.
  • Scrape 4chan for alpha from the LLM/CV communities.
  • Run stylometry on popular Twitter accounts to figure out if people are ghostwriting for others.
  • There may be something interesting to do with NotebookLM if they ever release an API.
  • GitHub style commit graph for Congress. 1
  • There are a million companies to start that use GPT-4o native image generation in some clever way.
  • Cool ways to store data like chess games. 1
  • Solve Dunnet with LLMs automatically.
  • LLMs for analyzing logs and automatically finding errors.
  • Guess how many likes a Tweet or TikTok has by reading the content alone, to tune your internal model.
  • Cursor extension for adding meta comments that only the LLM can see.
  • Read every word a person has ever written by having LLMs scrape the internet for everything.
  • Unedited, real-time timelapses of people using AI, to understand how the most effective work is done.
  • Post what you got done this week anonymously.
  • Use Claude computer use or BrowserBase for headless automation.
  • Use some of the new video models like Veo, HeyGen, Ideogram, Runway, etc.
  • LaTeX codegen Vision LM: let the LLM generate LaTeX, see the output and iterate.
  • PDF to brainrot.
  • Dwarkesh podcast generator for papers and articles, turn any text into a podcast (like NotebookLM).
  • Use dithering for something cool. Visual Electric is also a really cool image generation tool.
  • Unbrick the Car Thing from Spotify.
  • Map OpenAI releases to employee GitHub activity.
  • Track terrible journalist takes from a long time ago on a website. Maybe some way to rank and grade them?
  • Pictionary except you have to guess the prompt used to generate the image, use something like FAL.
  • Compute sentiment of quotes of a Tweet, figure out if people are mad at it.
  • Let Claude control an iPhone with iPhone mirroring + computer use.
  • Pictionary where you draw and the LLMs try to guess.
  • A playground to quickly iterate on AI UX interfaces (I guess v0 is basically this).
  • hnfast.com: get the high level updates.
  • GazeLLE for % eye contact made in a podcast, figure out who can keep eye contact.
  • Shortcut (maybe text) for creating calendar events from text.
  • HUD for autocomplete while you’re in a meeting, so you can easily figure out what to say next.
  • Automatically ingest the Internet on a user’s behalf and screen it for memetic viruses. “Internet condom”. 1
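
One of the ideas above (segment a webpage → embed it → select pieces with a nearest neighbor search against the query) is concrete enough to sketch. This is only a rough illustration under loud assumptions: the "embedding" here is a toy bag-of-words vector standing in for the learned, CLIP-style contrastive model the idea actually calls for, segmentation is just splitting on blank lines rather than anything DOM-aware, and the names `segment`, `embed`, `cosine`, `top_segments`, and the sample `PAGE` text are all made up for the example.

```python
import math
import re
from collections import Counter


def segment(page_text: str) -> list[str]:
    """Split page text on blank lines (a stand-in for real webpage segmentation)."""
    return [s.strip() for s in re.split(r"\n\s*\n", page_text) if s.strip()]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real version would call a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_segments(page_text: str, query: str, k: int = 1) -> list[str]:
    """Return the k segments whose vectors are closest to the query vector."""
    query_vec = embed(query)
    ranked = sorted(
        segment(page_text),
        key=lambda s: cosine(embed(s), query_vec),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical page content, already stripped of markup.
PAGE = """Sign up for our newsletter to get weekly updates.

Shipping takes three to five business days within the US.

Returns are accepted within thirty days of purchase."""

print(top_segments(PAGE, "how long does shipping take"))
```

With word counts, only exact token overlap matters, so the query above matches the shipping paragraph purely through the word "shipping". Swapping `embed` for a real embedding model is where the contrastive web snippet/text data would come in.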