What is the Agent2Agent (A2A) Protocol?#
The Agent2Agent (A2A) Protocol is a new open standard released by Google, aimed at addressing the challenges of collaboration between today's AI agents. While existing AI agents can perform specific tasks, they rely on custom code for inter-agent communication, making cross-agent collaboration complex and inflexible. The A2A Protocol defines a lightweight communication mechanism that lets one agent discover another, authenticate with it, and receive result streams from it, without sharing prompt context or re-implementing authentication.
The emergence of the A2A Protocol addresses pain points in multi-agent collaboration, such as unstable handoffs, security constraints, and vendor lock-in. It enables client agents to communicate securely and efficiently with remote agents by defining "Agent Cards," "Task" state machines, and streaming "Messages" or "Artifacts." The A2A Protocol does not replace the existing Model Context Protocol (MCP); it complements it, filling the collaboration gap between agents.
In the A2A Protocol, the client agent is responsible for discovering the remote agent's Agent Card, checking whether it can satisfy that agent's authentication requirements, and creating tasks via JSON-RPC messages. Once a task is created, the client agent acts as the task manager: it listens for status events, forwards any additional input the remote agent requests, and ultimately collects the artifacts for later use. The remote agent is responsible for executing the core work of the task and reporting task status and artifact updates back to the client agent through streaming events.
The communication mechanism of the A2A Protocol is one-way, with only the client agent able to initiate JSON-RPC requests, while the remote agent is responsible for updating task statuses. This communication model resembles a "front-end and back-end" collaboration relationship, where the client agent receives new orders and conveys clarifications, while the remote agent focuses on completing tasks until results are delivered.
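As a rough illustration of this flow, here is a minimal TypeScript sketch of a client agent discovering a remote agent's Agent Card and creating a task over JSON-RPC. The well-known path, the "tasks/send" method name, and the field names follow the early A2A draft described above and should be treated as assumptions rather than the authoritative wire format.

```ts
// Minimal client-side sketch of the A2A flow described above.
// Endpoint paths, the "tasks/send" method name, and field names are
// assumptions based on the early A2A draft, not a definitive spec.

interface AgentCard {
  name: string;
  url: string;                              // JSON-RPC endpoint of the remote agent
  capabilities?: { streaming?: boolean };
  authentication?: { schemes: string[] };
}

async function discoverAgent(baseUrl: string): Promise<AgentCard> {
  // Agent Cards are conventionally served from a well-known path.
  const res = await fetch(new URL("/.well-known/agent.json", baseUrl));
  if (!res.ok) throw new Error(`No Agent Card at ${baseUrl}: ${res.status}`);
  return res.json() as Promise<AgentCard>;
}

async function createTask(card: AgentCard, taskId: string, text: string, token: string) {
  // The client agent initiates the JSON-RPC request; the remote agent
  // answers with task status updates and, eventually, artifacts.
  const res = await fetch(card.url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`, // one of several supported auth schemes
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "tasks/send",
      params: {
        id: taskId,
        message: { role: "user", parts: [{ type: "text", text }] },
      },
    }),
  });
  return res.json(); // task object with status, and artifacts once completed
}
```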
Architecturally, the A2A Protocol sits above the MCP and runs in parallel with workflow orchestration tools. It focuses on agent-to-agent communication at the network layer, while the MCP focuses on tool calls within a single agent. This layered design allows frameworks and vendors to innovate at the lower levels while maintaining the protocol's simplicity and universality.
In terms of security, the A2A Protocol ensures communication security through signed Agent Cards, multiple authentication methods, and runtime policies. Agent Cards can be signed using JSON Web Signature (JWS), allowing client agents to verify the signature and detect tampering. The protocol also supports various authentication methods, including simple Bearer tokens, mutual TLS, and enterprise single sign-on flows. Remote agents can additionally inspect incoming payloads before model execution, rejecting those that are excessively large or high-risk.
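For the signed-card check, a client might verify the JWS before trusting the card, along these lines. This is a sketch using the jose library; the key format and algorithm are illustrative assumptions, since the protocol leaves such choices to deployments.

```ts
import { compactVerify, importSPKI } from "jose";

// Verify a JWS-signed Agent Card before trusting its declared capabilities.
// ES256 and an SPKI-encoded public key are illustrative choices here,
// not something the A2A spec mandates.
async function verifyAgentCard(signedCard: string, publicKeyPem: string) {
  const key = await importSPKI(publicKeyPem, "ES256");
  const { payload } = await compactVerify(signedCard, key);
  return JSON.parse(new TextDecoder().decode(payload)); // the verified card
}
```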
Regarding observability, each state or artifact event in the A2A Protocol carries timestamps, task IDs, and optional tracing headers. By wrapping the A2A client in OpenTelemetry middleware, end-to-end tracing can be easily implemented without manually parsing JSON data. This tracing data can be integrated into an enterprise observability platform to detect and resolve issues before they impact customers.
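A minimal sketch of such middleware, assuming the @opentelemetry/api package with an SDK configured elsewhere; the span and attribute names are illustrative, not part of the protocol.

```ts
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("a2a-client");

// Wrap an outgoing A2A JSON-RPC call in a span so task IDs show up
// in traces without hand-parsing the JSON payloads.
async function tracedSend(endpoint: string, taskId: string, body: unknown) {
  return tracer.startActiveSpan("a2a.tasks/send", async (span) => {
    span.setAttribute("a2a.task.id", taskId);
    try {
      const res = await fetch(endpoint, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body),
      });
      span.setAttribute("http.status_code", res.status);
      return await res.json();
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```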
For discovery mechanisms, the current A2A Protocol discovery primarily relies on internal team YAML files or Google's Vertex AI directory. In the future, as public registries mature, a registry similar to npm for agents may emerge, but there is currently no unified "verification badge."
The A2A Protocol offers significant security improvements over the MCP. The MCP exposes natural language prompts of tools, making it vulnerable to injection attacks and parameter tampering. In contrast, the A2A Protocol hides these details within the remote agent's internals, allowing client agents to see only high-level tasks and restricted artifacts, thus eliminating the possibility of an entire category of attacks.
Despite the promising vision of the A2A Protocol, as of May 2025 its practical application remains in the early stages. About 50 vendors have announced support for the A2A Protocol, but most agents are still only available as private demos. Currently, adapters for LangGraph, CrewAI, and AutoGen are relatively mature, while Flowise and n8n support is still in community testing. There is no public registry yet, so teams primarily rely on YAML files or Vertex AI's private directory to manage agents. Few agents provide signed Agent Cards, and rate limiting or billing caps need to be implemented through custom middleware. In terms of performance, Google's reference server adds roughly 30 milliseconds of latency per hop in local testing.
The A2A Protocol is suitable for cross-vendor workflows, security-sensitive black-box agents, hybrid tech stacks, and long-running tasks that require progress updates. It provides the ability for agents from different vendors to share handshakes, authentication, and streaming communication without exposing underlying prompt details. However, for agents running in the same process, small script tasks, one-time data pulls, or tools that primarily rely on complex parameter validation, the A2A Protocol may be overly complex, and direct API calls or simple REST endpoints may be more appropriate.
The A2A Protocol offers a standardized, secure, and efficient solution for collaboration between AI agents. Although its ecosystem is still developing and maturing, it has already provided sufficient support for prototype development and internal workflows, with the potential to drive widespread multi-agent collaboration in the future.
#Google #AI #Agent2Agent
What is the Agent2Agent (A2A) Protocol?
AI for Mature Teams#
Alice Moore explores the current state of AI applications in development and the challenges they pose for mature teams. While AI tools like Bolt, v0, and Lovable can quickly generate applications, this "demo-first" approach to AI is not suitable for mature teams, as it brings issues in design, development, and marketing.
For designers, existing AI design tools generate colors, border radii, and fonts that are inconsistent with brand styles, leading to a breakdown of design systems and requiring significant time to fix these issues. Developers face similar dilemmas, as AI-generated code often consists of single, large components lacking tests, accessibility labels, and separation of concerns, making it difficult to maintain and refactor. Marketers can quickly generate landing pages, but the data on those pages is mostly fictitious, necessitating code rewrites to connect CMS and analytics tools, increasing development workload.
AI should empower mature teams rather than merely serve as a rapid generation tool. To achieve this, builders and users of AI tools need a mindset shift. Tool builders must respect existing tech stacks, ensuring that AI outputs align with the team's existing components, tokens, data, and tests; embed AI tools into the software already used by the team, such as Figma, IDEs, and headless CMS; and provide sufficient control for professionals to review and adjust AI-generated code. Professionals should provide context for AI, such as documentation, prototypes, design tokens, and test cases; take responsibility for the final results, ensuring user experience, performance, and copy quality; and recognize AI's limitations, viewing it as a translation layer between different professional domains rather than a replacement for team tools.
Alice discusses Builder.io's practices in building "mature AI." Builder.io launched in 2019 as a headless CMS with a visual editor, aiming to enable developers to use their JavaScript framework components while allowing non-developers to arrange components without touching code. Its editor is based on the open-source project Mitosis, allowing components to be described once and compiled into any JS framework the team runs. Before the generative AI wave, Builder.io products already featured three "mature parts": a fully functional visual editor, deterministic component mapping, and real CMS data sources. This allowed Builder.io to focus on reducing tedious work when adding AI features rather than reinventing foundational functionality.
Currently, Builder.io products operate through a series of progressively adoptable layers without requiring large-scale rewrites. For example, its "visual editor → code" feature allows users to generate pages through AI prompts and then manually adjust any classes, tokens, or breakpoints; the "Figma → visual editor (and code)" feature can convert Figma designs into clean, responsive framework code; the upcoming "repository → visual editor → repository PR" feature will allow users to import code directly from GitHub repositories for modification and automatically send it as a PR; the "component mapping" feature enables users to match Figma components with code components, ensuring that generated code uses real components and tokens; and the "Builder Publish" feature is a complete headless CMS that supports real-time content and analytics, allowing marketers to run A/B tests independently.
Despite progress, Builder.io continues to strive for improvement. Future work includes reducing manual context input, making component and token mapping more automated; providing deeper control, allowing advanced users to directly view and adjust various parts of AI drafts in the editor; and supporting a wider range of design systems to streamline the mapping process.
The development of AI should not leave humans with the tedious cleanup work; instead, "mature AI" should correct that 80/20 split, allowing humans to focus on the craftsmanship that models cannot fake. Tool builders need to provide reliable frameworks, while professionals need to provide context and accountability. Each component mapping, token lock, and manual review makes the next generation more predictable, and predictability is key to truly enhancing efficiency.
#AI #Thoughts
Vibe Coding: A Roadmap to Becoming an AI Developer#
Gwen Davis provides developers with a detailed guide on growing from ordinary programmers to AI development experts. With the widespread application of AI technology across various fields, it is expected that by 2027, 80% of developers will need to master basic AI skills, making now an excellent time to enter this field.
Developers should first master several key programming languages and frameworks, such as Python, Java, and C++, which have widespread applications in AI and machine learning. Frameworks like TensorFlow, Keras, PyTorch, and Scikit-learn are also essential tools for developers. GitHub offers a wealth of learning resources, such as GitHub Learning Lab, The Algorithms, TensorFlow Tutorials, and PyTorch Examples, to help developers quickly enhance their relevant skills. Additionally, GitHub Copilot, as an AI-assisted programming tool, can provide real-time code suggestions, helping developers learn and use new programming languages and frameworks more efficiently.
In machine learning, Gwen advises developers to delve into key subfields such as deep learning, natural language processing (NLP), and computer vision. These areas not only drive the development of AI technology but also play important roles in practical applications. Developers can access relevant tools and tutorials through open-source projects on GitHub, such as Awesome Machine Learning, Keras, NLTK, and OpenCV, participate in Kaggle competition solution development, or contribute code to open-source AI projects with "good first issue" labels to gain practical experience.
Developers need to build an outstanding GitHub personal portfolio to showcase their skills and project achievements. This includes organizing code repositories effectively, highlighting excellent projects, creating a professional personal profile, using GitHub Pages to build a personal website, and actively participating in open-source project contributions. Through these means, developers can stand out in the developer community and attract the attention of potential employers or partners.
Gwen recommends that developers obtain GitHub Copilot certification to demonstrate their proficiency in using AI-driven tools to enhance development efficiency. GitHub offers certification courses covering topics such as AI-driven development, workflow automation, and integration with CI/CD pipelines. By studying official documentation, completing practical exercises, and applying GitHub Copilot in real projects, developers can prepare for the certification exam and earn a digital badge to showcase on LinkedIn, GitHub profiles, or personal portfolios, further enhancing their career competitiveness.
Gwen encourages developers to seize the opportunities brought by the AI revolution, utilizing the tools and resources provided by GitHub to start building and exploring AI projects, shaping the future of technological development.
#AI #Github #Guide
Vibe coding: Your roadmap to becoming an AI developer
A Comprehensive Look at GitHub Copilot's Powerful Agent Mode#
Alexandra Lietzke introduces Agent Mode, a powerful GitHub Copilot feature. Agent Mode is an autonomous, real-time, synchronous collaboration tool capable of executing multi-step coding tasks from natural language prompts, helping developers move quickly from requirements to prototypes.
The core of Agent Mode lies in its ability to understand developers' intentions, build solutions, and iterate continuously until the desired results are achieved. It can analyze codebases for complete context, plan and execute multi-step solutions, run commands or tests, call external tools for specific tasks, and even suggest architectural improvements. Through a system prompt, Agent Mode can autonomously run commands, apply edits, detect errors, and adjust in real-time, allowing developers to clearly see its reasoning process and the tools used.
Agent Mode is suitable for developers of different levels. For beginners, it serves as a synchronous development tool that helps quickly build applications; for experienced developers, it can significantly enhance work efficiency, allowing them to focus on higher-level problem-solving. Agent Mode also supports installing more specialized tools through Model Context Protocol (MCP) servers or extensions, thereby expanding its functionality, such as automating GitHub workflows or extracting and analyzing repository data.
Developers can use Agent Mode in various ways, including refactoring code, migrating projects, writing tests, modernizing legacy code, automatically fixing code generation errors, adding new features, and building prototypes based on functional specifications or UI sketches. The article also notes that since Agent Mode is based on non-deterministic large language models (LLMs), its suggestions may vary even with the same prompts and contexts.
Developers can start using Agent Mode by opening the Copilot Chat view in VS Code and selecting Agent Mode, or previewing it in Visual Studio. Agent Mode can also be combined with other GitHub Copilot features, such as custom instructions, to adjust Copilot's responses based on developers' daily coding practices, tools, and development processes. Additionally, developers can choose different AI models to drive Agent Mode to meet various development needs.
Agent Mode gives developers great flexibility: they can tailor it to their own styles and needs, whether building prototype applications, working with existing codebases, or automating low-level tasks in their workflows, and thereby complete development work more efficiently.
#AI #Copilot #Agents #Github #Practice
Agent mode 101: All about GitHub Copilot’s powerful mode
Using GitHub Copilot for Test-Driven Development (TDD)#
Kedasha Kerr discusses how to leverage GitHub Copilot to implement test-driven development (TDD). Testing is an essential yet often tedious part of the development process, especially as codebases grow in size and complexity. GitHub Copilot can effectively help developers automate parts of the testing process, thereby improving development efficiency.
Kedasha first emphasizes the importance of testing, noting that it is a key means of ensuring code behaves as expected. There are various types of tests, including acceptance tests, integration tests, and unit tests. Among them, unit tests break code down into smaller units for testing, ensuring each unit operates correctly, thereby enhancing confidence in the entire application. Another advantage of unit tests is their automatable nature, allowing developers to run numerous tests to quickly assess code health and identify potential issues.
Kedasha then introduces how to use GitHub Copilot to write unit tests. In Visual Studio Code, developers can highlight a code snippet, invoke Copilot Chat, and use a prompt such as "/tests add unit tests for my code" to generate test code. Copilot proposes a testing plan and code suggestions based on the highlighted code, which developers can add to a new file and run with "python -m pytest".
TDD is a development approach where tests are written before the implementation code, with the core idea being to guide the development process through testing. A key concept of TDD is "red-green-refactor": first, write a test that fails (red phase), then write just enough code to pass the test (green phase), and finally refactor the code to optimize its structure while ensuring tests continue to pass. GitHub Copilot plays a particularly prominent role in TDD, allowing developers to generate test code by describing expected functionality to Copilot, and then letting Copilot generate implementation code, thus rapidly completing the development process.
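The post walks through this with Python and pytest; purely to illustrate the same red-green-refactor loop, here is a sketch in TypeScript using Node's built-in test runner. The slugify function is an invented example, not something from the article.

```ts
import { test } from "node:test";
import assert from "node:assert/strict";

// Red: write the test first, describing the behavior we want.
test("slugify lowercases and joins words with dashes", () => {
  assert.equal(slugify("Hello World"), "hello-world");
});

// Green: write just enough implementation to make the test pass.
// (This is the step a tool like Copilot can draft from the failing test.)
// Refactor comes next, with the test guarding against regressions.
function slugify(input: string): string {
  return input.trim().toLowerCase().split(/\s+/).join("-");
}
```

Running the file with "node --test" shows the red phase when the implementation is missing and the green phase once it is filled in.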
Kedasha also highlights best practices for writing unit tests, including adding documentation for tests, keeping tests organized, creating testing tools to improve efficiency, and updating tests when code changes. Additionally, the article provides several resource links for developers to further learn how to use GitHub Copilot for test development.
#Copilot #Github #AI #TDD #Testing
GitHub for Beginners: Test-driven development (TDD) with GitHub Copilot
Chrome DevTools Updates#
Gemini Integration: Developers can ask Gemini to modify CSS and save the changes to a local workspace. Additionally, Gemini can generate annotations for performance insights and add screenshots to performance traces, giving developers more intuitive visual aids.
Performance Panel Updates: The performance panel has added two new insight features—duplicate JavaScript and legacy JavaScript. The former identifies large JavaScript modules that are loaded multiple times on the page, while the latter detects polyfills and transpiled code loaded for compatibility with older browsers, helping developers optimize code for modern browsers.
Workspace Connection Feature: Developers can connect local folders to the DevTools workspace, allowing changes to JavaScript, HTML, and CSS to be saved back to local source files, supporting both automatic and manual connections.
Lighthouse 12.6.0: The Lighthouse panel has been upgraded to version 12.6.0, adding performance insight audit features, allowing developers to try using these insights in the performance categories of Lighthouse reports.
Other Improvements: These include the addition of server timing parsing capabilities in the network panel, improvements to the "Use efficient cache lifetimes" insight feature in the performance panel, and increased support for screen readers across more features.
#Chrome #Gemini #DevTools
What's new in DevTools, Chrome 137 | Blog | Chrome for Developers
AI-Assisted Performance Analysis#
The AI-assisted performance analysis feature in Chrome DevTools is an experimental tool designed to help developers better understand and optimize web performance. This feature is currently available only in Chrome Canary 132 and higher. Developers can open the AI assistance panel in various ways, such as selecting specific performance insights (like LCP by phase, LCP request discovery, etc.) from the "Insights" tab of the performance panel and clicking the "Ask AI" button; or right-clicking an activity in the performance trace view and selecting the "Ask AI" option. Additionally, developers can enter "AI" in the command menu and run the "Show AI Assistance" command, or select "AI Assistance" from the "More Tools" menu to open the panel.
The AI assistance panel engages in dialogue based on the performance activities selected by the developer, with relevant activities displayed in the lower-left corner of the panel. If opened from performance insights, that insight will serve as the pre-selected context; if opened from the trace view, the selected activity will serve as the context. AI assistance uses timing data from the selected call tree to answer questions, and developers can view the raw data used by AI by clicking relevant buttons after the dialogue begins.
To help developers quickly start conversations, AI assistance provides example prompts, and developers can also input their own questions. If AI does not provide an answer, it may be because the question is beyond its capabilities, and developers can submit feedback to the Chrome team. AI assistance saves conversation history, allowing developers to access previous dialogues even after reloading DevTools or Chrome. Through the controls in the upper-left corner of the panel, developers can start new conversations, continue old conversations, or delete history.
Additionally, AI assistance offers a feedback mechanism, allowing developers to rate answers with "thumbs up" or "thumbs down" buttons or report inappropriate content. This feedback will help the Chrome team improve the quality of AI assistance responses and the overall experience. It is important to note that AI assistance is still in the experimental stage and may have some known issues, so developers should be aware of the relevant data usage policies before use.
#Google #AI
AI assistance for performance | Chrome DevTools | Chrome for Developers
Enhancing Gemini Nano: Delivering Higher Quality Summaries with LoRA#
The Chrome team, in collaboration with Google Cloud, has fine-tuned the Gemini Nano model using Low-Rank Adaptation (LoRA) technology, significantly improving the quality of text summary generation. The article details the functionality of the summary API, the experimental methods of fine-tuning, performance evaluations, and the advantages of LoRA technology, along with real-time inference demonstrations and feedback channels.
The Summarizer API is a built-in Chrome capability that condenses long text into concise, readable summaries, suitable for scenarios such as key-point lists on news sites or sentiment summaries of product reviews. To meet different sites' needs for summary style and length, the API offers several summary types, such as headline, key points, a minimalist "tl;dr," and an engaging teaser-style preview. The article shows how to use the Summarizer API to generate summaries for Wikipedia pages and links to an online demo where developers can test the API.
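For orientation, page-side usage looks roughly like the following sketch. The global name and option values have shifted across Canary releases, so treat this exact shape as an assumption and check the current Chrome documentation.

```ts
// Sketch of calling Chrome's built-in Summarizer API from a page.
// The global and its option values are assumptions based on the API
// as documented around the time of the post, not a stable contract.
declare const Summarizer: {
  availability(): Promise<string>;
  create(opts: { type: string; format: string; length: string }): Promise<{
    summarize(text: string): Promise<string>;
  }>;
};

async function tldr(article: string): Promise<string> {
  if ((await Summarizer.availability()) === "unavailable") {
    throw new Error("On-device summarization is not available in this browser");
  }
  const summarizer = await Summarizer.create({
    type: "tl;dr",        // other documented types include "key-points",
    format: "plain-text", // "teaser", and "headline"
    length: "short",
  });
  return summarizer.summarize(article);
}
```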
The Chrome team explains how to enable the fine-tuning feature in Chrome Canary. Starting from version 138.0.7180.0, users can enable this feature through specific Chrome flags. Once enabled, users can enter specific commands in the DevTools console to download the supplementary LoRA model and begin experimentation.
In terms of performance evaluation, the Chrome team employed both automated evaluation and Autorater assessment methods. Automated evaluation checks the quality of model outputs through software, focusing on issues like formatting errors, sentence duplication, and non-English characters. Results show that the fine-tuned Gemini Nano significantly reduces formatting error rates, demonstrating better performance in summarizing both articles and chat logs. The Autorater assessment uses the Gemini 1.5 Pro model to score the output quality of Gemini Nano, with evaluation metrics including coverage, factuality, formatting, clarity, and additional metrics for different summary types, such as attractiveness, conciseness, and engagement. Evaluation results indicate that the LoRA fine-tuned Gemini Nano outperforms the base model across all metrics.
The core of LoRA technology lies in guiding the model toward desired directions by adding small additional components rather than adjusting all model parameters. This approach can significantly reduce training time and costs, requiring only 2% of the original parameters to achieve significant output changes. The Chrome team demonstrates the capability of LoRA technology in enhancing summary quality and meeting specific instructions by comparing summary examples generated by the base Gemini Nano model and the LoRA fine-tuned model.
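The post stays qualitative about this, but the standard LoRA formulation (a general reference point, not something taken from the article) makes the parameter-count claim concrete: the frozen weight matrix receives a low-rank additive update, and only the two small factors are trained.

```latex
W' = W + \Delta W, \qquad \Delta W = \tfrac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
```

Training only B and A means learning r(d + k) values instead of d·k, which is how an update on the order of a few percent of the original parameters can still noticeably steer the output.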
Additionally, a real-time inference demonstration showcases the performance differences between Gemini Nano and Gemini Nano with LoRA in generating the "tl;dr" summary of the article on "Ocean Sunfish." Through fine-tuning, Gemini Nano is better able to generate summaries that align with specific instructions.
The Chrome team encourages developers to try the updated model in Chrome Canary and share feedback. Developers can submit bug reports or feature requests through Chromium's issue tracker to help improve Chrome's implementation.
#Chrome #Google #AI
Enhancing Gemini Nano: delivering higher quality summaries with LoRA | Blog | Chrome for Developers
ChatGPT and the Proliferation of Obsolete and Broken Solutions to Problems We Hadn't Encountered for Over Half a Decade Before Its Launch#
ChatGPT faces issues when generating code, particularly that its solutions may be outdated or even incorrect, yet are widely disseminated and used.
Someone asked how to achieve a CSS animated gradient text effect similar to ChatGPT's "Searching the web" shimmer, and while the code ChatGPT provided does achieve the effect, it contains many redundant and outdated parts. For instance, it includes declarations such as -webkit-text-fill-color and a prefixed background-clip, which were once needed for compatibility with certain browsers but are no longer necessary today. The author points out that modern browsers widely support background-clip: text, so there is no need to make the text transparent via -webkit-text-fill-color; ChatGPT's approach does not improve compatibility and only adds unnecessary code.
Tudor further reviews the history of CSS gradient text. As early as 2010, the background-clip property was not widely supported, and developers had to use -webkit-background-clip to achieve gradient text effects, requiring complex CSS property combinations to ensure basic text display in non-WebKit browsers. However, over time, modern browsers have fully supported these features, and past compatibility solutions are no longer needed, yet tools like ChatGPT still generate this outdated code.
Tudor also notes that the way ChatGPT sets background-size and background-position in code appears redundant. In the past, some browsers might not have supported setting background-size in shorthand, but this issue has long been resolved. Additionally, for horizontal gradients, the second value of background-size (height) is irrelevant; whether set to 200% or auto, the visual effect remains the same.
This issue of generating outdated code is not unique to ChatGPT. Other AI tools like Gemini also exhibit similar problems, often mixing old popular solutions with modern CSS to produce a "grotesque hybrid." Tudor emphasizes that developers need to be more cautious when using AI-generated code, not merely relying on AI outputs but rather optimizing code in line with the latest technical standards and practices.
#AI #Thoughts
Pure HTML and CSS Implementation of Minecraft#
I looked at the HTML, and it is quite large; it may run a bit sluggishly.
#Frontend #CSS
Open Source Society University: A Free Path to Self-Learning CS (Computer Science)#
#Courses
Finding Your Life Direction#
Jessica Livingston shares her speech at Bucknell University's 2025 graduation ceremony. She reflects on her own confusion upon graduating from college: she had an English degree, no clear career plan, and nothing she was truly interested in, and it took her ten years to find her direction. She hopes to help graduates find their goals more quickly, especially those eager for ambitious plans but yet to find their direction.
Most people progress along a predetermined track during their growth, such as elementary school, middle school, high school, and college, leading them to mistakenly believe that each stage of life has a clear next step. However, graduating from college marks the end of this "track," and people need to realize that from this moment on, they can freely choose any direction. While this freedom is exciting, it also terrifies many, leading them to seek new "tracks," such as finding a job at a well-known company, even if that job does not truly attract them.
Graduates can redefine themselves at this juncture of graduation. Many may lack confidence due to past academic performance or experiences, but she encourages everyone not to be limited by these constraints, as others do not know their past. If they want to change, they can start now, becoming more curious, responsible, or energetic; no one will stop this transformation.
She shares her experience of choosing a job she was not interested in after graduation, where she was simply happy that someone was willing to pay her. There are countless career options after college graduation, which she did not realize at the time. She advises graduates to actively explore these options rather than passively accept the first opportunity. The best way to narrow down choices is through conversations. She suggests talking to different people to understand what they do, and if they find that their environment or job does not align with those around them, they should decisively leave.
She illustrates this point through her own experience. She found excitement in things related to entrepreneurship and decided to write a book about it. Although many expressed skepticism about her project, she was not swayed by negative feedback. To achieve ambitious plans, one must learn to resist others' doubts and rejections. She cites the founding of Y Combinator as an example, showing that even if initially regarded as a joke, as long as one believes in their idea, success can ultimately be achieved.
Jessica Livingston encourages graduates to take proactive control of their life direction and not go with the flow. She suggests finding the path that suits them best through conversations with interesting people. She hopes graduates will remember that while there are many choices in life, building connections with others can lead to discovering truly exciting and engaging careers.
#Thoughts #Entrepreneurship
Defuddle#
Defuddle aims to extract the main content from web pages, removing unnecessary elements such as comments, sidebars, headers, footers, etc., to generate a clean and readable HTML document.
The core functionality of Defuddle is to analyze the DOM structure and style information of web pages to identify and retain the main text content while removing irrelevant distracting elements. It supports various input methods, including parsing directly from the web page DOM, parsing from HTML strings, and parsing from URLs. Additionally, Defuddle offers rich configuration options, allowing users to adjust parsing behavior as needed, such as enabling debug mode, converting to Markdown format, and retaining specific HTML attributes.
In terms of technical implementation, Defuddle provides three different packaged versions: the core version (defuddle), suitable for browser environments with no additional dependencies; the full version (defuddle/full), which includes additional mathematical formula parsing capabilities; and the Node.js version (defuddle/node), optimized for Node.js environments, supporting full feature sets. Defuddle is written in TypeScript, with a clear code structure that is easy to extend and maintain.
The output of Defuddle is an object containing various metadata, such as article titles, authors, descriptions, publication dates, word counts, etc., while also providing standardized HTML content. It standardizes HTML elements through a series of rules, such as converting H1 to H2, removing line numbers and syntax highlighting from code blocks, unifying footnote formats, and converting mathematical formulas to standard MathML format.
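Based on the input methods, options, and result fields described above, browser-side usage presumably looks something like the sketch below; the constructor-and-parse shape and the option names are taken from this summary and should be checked against the project's README.

```ts
import Defuddle from "defuddle";

// Parse the current page's DOM and extract the main content.
// The API shape and these option names follow the project README as
// summarized above; verify against the actual documentation.
const defuddle = new Defuddle(document, {
  debug: false,   // extra logging while tuning extraction
  markdown: true, // convert the cleaned content to Markdown
});

const result = defuddle.parse();

console.log(result.title, result.author, result.wordCount);
console.log(result.content); // standardized HTML (or Markdown, per options)
```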
#Tools
Supermemory#
Supermemory helps users build their own "second brain," providing ChatGPT-like intelligent features for bookmarks. Users can import tweets or save websites and content through a Chrome extension.
The core functionality of Supermemory is to provide a powerful developer-friendly API for AI applications, seamlessly integrating external knowledge as a perfect memory layer for the AI stack, offering semantic search and retrieval capabilities to enhance the model's relevant context. Users can utilize it to store and organize knowledge, find information based on meaning rather than just keyword matching through a semantic search engine, and connect to any data source, such as websites, PDFs, images, etc.
Supermemory has a wide range of application scenarios, including enhancing retrieval-augmented generation (RAG) of LLM outputs, creating intelligent searchable knowledge bases and documents, building chatbots with access to supporting documents, querying research assistants across papers, notes, and references, and semantically organizing and retrieving multimedia content in content management systems.
#AI #Tools