<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
        <title>Gemini - Tag - Simi Studio</title>
        <link>/en/tags/gemini/</link>
        <description>Gemini - Tag - Simi Studio</description>
        <generator>Hugo -- gohugo.io</generator><language>en</language><managingEditor>simi@simi.studio (Simi)</managingEditor>
            <webMaster>simi@simi.studio (Simi)</webMaster><lastBuildDate>Tue, 10 Mar 2026 14:30:00 &#43;0800</lastBuildDate><atom:link href="/en/tags/gemini/" rel="self" type="application/rss+xml" /><item>
    <title>Have Multimodal LLMs Matured? Early 2026 Real Testing</title>
    <link>/en/posts/multimodal-llm-evolution/</link>
    <pubDate>Tue, 10 Mar 2026 14:30:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/multimodal-llm-evolution/</guid>
    <description><![CDATA[GPT-4o, Gemini 2.0, and Claude 3.7 all support multimodal input. Image, audio, and video understanding: which model is strongest? Real testing results in this article.]]></description>
</item>
<item>
    <title>Computer Use Agent Roundup: Claude, GPT-4o, Gemini - Who Controls Computers Best</title>
    <link>/en/posts/computer-use-agent-analysis/</link>
    <pubDate>Sun, 01 Mar 2026 10:00:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/computer-use-agent-analysis/</guid>
    <description><![CDATA[Computer Use (AI directly controlling computers) is the hottest direction in 2026. Anthropic, OpenAI, and Google have all released solutions. This article compares real-world test results across them.]]></description>
</item>
<item>
    <title>Gemini 3.1 Pro: 77.1% on ARC-AGI-2, Hallucination Rate Dropped from 88% to 44%</title>
    <link>/en/posts/gemini-3-1-pro-arc-agi/</link>
    <pubDate>Fri, 20 Feb 2026 10:00:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/gemini-3-1-pro-arc-agi/</guid>
    <description><![CDATA[February 20, 2026—Google releases Gemini 3.1 Pro, scoring 77.1% on ARC-AGI-2 (2x the previous generation) while cutting hallucination rate from 88% to 44%.]]></description>
</item>
<item>
    <title>Gemini 3 Deep Think: 84.6% on ARC-AGI-2, 0.4% Shy of the AGI Signal Threshold</title>
    <link>/en/posts/gemini-3-deep-think-arc-agi/</link>
    <pubDate>Fri, 13 Feb 2026 10:00:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/gemini-3-deep-think-arc-agi/</guid>
    <description><![CDATA[February 13, 2026—Google releases Gemini 3 Deep Think mode, scoring 84.6% on ARC-AGI-2, just 0.4% below the ARC Prize "strong AGI signal" threshold of 85%.]]></description>
</item>
<item>
    <title>Gemini 2.0 Flash Thinking: How Does Google&#39;s Coding Stack Up</title>
    <link>/en/posts/gemini-2-analysis/</link>
    <pubDate>Thu, 05 Feb 2026 10:30:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/gemini-2-analysis/</guid>
    <description><![CDATA[Google Gemini 2.0 launched Flash Thinking mode—how does it perform on coding tasks? This article gives an objective evaluation after real testing.]]></description>
</item>
<item>
    <title>Gemini Reasoner: First Model to Surpass Human Average on Complex Reasoning</title>
    <link>/en/posts/gemini-reasoner-analysis/</link>
    <pubDate>Mon, 05 Jan 2026 10:00:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/gemini-reasoner-analysis/</guid>
    <description><![CDATA[On January 5, 2026, Google DeepMind released Gemini Reasoner—the first model to systematically outperform human average on complex cross-modal reasoning tasks including scientific hypothesis generation, causal inference, and long-horizon planning.]]></description>
</item>
<item>
    <title>LLM Selection Guide: Claude vs GPT-4o vs Gemini—Which One to Pick</title>
    <link>/en/posts/ai-model-selection-guide/</link>
    <pubDate>Sat, 27 Dec 2025 09:40:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/ai-model-selection-guide/</guid>
    <description><![CDATA[Claude 3.7, GPT-4o, Gemini 2.0—how to choose? A practical framework for model selection.]]></description>
</item>
<item>
    <title>LLM Context Window Race: A Marathon With No Finish Line</title>
    <link>/en/posts/llm-context-window-arms-race/</link>
    <pubDate>Sun, 15 Oct 2023 10:00:00 &#43;0800</pubDate>
    <author>simi@simi.studio (Simi)</author>
    <guid>/en/posts/llm-context-window-arms-race/</guid>
    <description><![CDATA[In mid-2023, Claude pushed its context window to 200k, GPT-4 sat at 8k/32k, and Gemini hit 1M. Context window size became the metric for judging models. This article explains why it matters and whether you can actually use 200k tokens in practice.]]></description>
</item>
</channel>
</rss>
