Friday, December 12, 2025

Bending Light: Simulating General Relativity in the Browser with Raw WebGL

As software engineers, we often work within the comfortable constraints of Euclidean geometry: grid layouts, vector positions, and linear interpolations. But what happens when the coordinate system itself is warped?

Recently, I built a real-time visualization of a Schwarzschild black hole using raw WebGL and JavaScript. The goal wasn't just to create a pretty image, but to solve a complex rendering problem: How do you perform ray casting when light rays don't travel in straight lines?

This project explores the intersection of high-performance graphics, general relativity, and orbital mechanics, all running at 60 FPS in a standard web browser.

The Challenge: Non-Euclidean Rendering

Standard 3D rendering (rasterization or standard ray tracing) assumes light travels linearly from a source to the camera. In the vicinity of a black hole, extreme gravity bends spacetime. Light follows geodesics—curves defined by the spacetime metric.

To visualize this, we cannot use standard polygon rasterization. Instead, we must use Ray Marching within a Fragment Shader, solving for the curved path of every pixel's ray mathematically in real time.

Architecture: The "No-Framework" Approach

While libraries like Three.js are excellent for production 3D apps, I chose the raw WebGL API for this simulation.

Why?

  1. Performance Control: I needed direct control over the GLSL rendering pipeline without overhead.

  2. First-Principles Engineering: Understanding the low-level buffer binding and shader compilation ensures we aren't relying on "black box" abstractions for critical math (a minimal setup sketch follows this list).

  3. Portability: The entire engine runs in a single HTML file with zero dependencies.
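
For readers who haven't touched the raw API, the scaffolding is smaller than it sounds. Here is a minimal, illustrative setup sketch (not the project's actual code, and it assumes a <canvas> element already exists on the page): compile two shaders, link a program, upload a full-screen quad, and let the fragment shader do the rest.

// Minimal WebGL scaffolding for a full-screen fragment shader (illustrative sketch).
const gl = document.querySelector('canvas').getContext('webgl');

function compile(type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    throw new Error(gl.getShaderInfoLog(shader));
  }
  return shader;
}

const vertexSrc = `
  attribute vec2 aPos;
  void main() { gl_Position = vec4(aPos, 0.0, 1.0); }`;
const fragmentSrc = `
  precision highp float;
  uniform vec2 uResolution;
  void main() { gl_FragColor = vec4(gl_FragCoord.xy / uResolution, 0.0, 1.0); }`; // ray marcher goes here

const program = gl.createProgram();
gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexSrc));
gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fragmentSrc));
gl.linkProgram(program);
gl.useProgram(program);

// Two triangles covering the screen; every pixel runs the fragment shader.
const quad = new Float32Array([-1, -1, 1, -1, -1, 1, 1, 1]);
gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
gl.bufferData(gl.ARRAY_BUFFER, quad, gl.STATIC_DRAW);
const aPos = gl.getAttribLocation(program, 'aPos');
gl.enableVertexAttribArray(aPos);
gl.vertexAttribPointer(aPos, 2, gl.FLOAT, false, 0, 0);

gl.uniform2f(gl.getUniformLocation(program, 'uResolution'), gl.canvas.width, gl.canvas.height);
gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);

Everything interesting then happens inside the fragment shader source, which is where the ray marcher described below lives.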

The Physics Stack

1. Relativistic Ray Marching (The Shader)

The core logic lives in the Fragment Shader. For every pixel on the screen, we fire a ray. However, instead of a simple vector addition (pos += dir * step), we apply a gravitational deflection force at every step of the march.

We approximate the Schwarzschild metric by modifying the ray's direction vector ($\vec{D}$) based on its distance ($r$) from the singularity:

// Inside the Ray Marching Loop
float stepSize = 0.08 * distToCenter; // Adaptive stepping
vec3 gravityForce = -normalize(rayPos) * (1.5 / (distToCenter * distToCenter));
// Bend the light
rayDir = normalize(rayDir + gravityForce * stepSize * 0.5);
rayPos += rayDir * stepSize;

Optimization Strategy: Note the stepSize. I implemented adaptive ray marching. We take large steps when the photon is far from the black hole (low compute cost) and micro-steps when close to the event horizon (high precision required). This keeps the loop iteration count low (~120 iterations per ray) while maintaining visual fidelity.

2. The Event Horizon & Accretion Disk

We define the Schwarzschild Radius ($R_s$).

  • If a ray's distance drops below $R_s$, it is trapped. The pixel returns black (the shadow).

  • If the ray intersects the equatorial plane ($y \approx 0$) within specific radii, we render the Accretion Disk.

To simulate the Doppler Beaming effect (where the disk looks brighter on the side moving toward the camera), I calculated the dot product of the disk's rotational velocity and the ray direction.
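
In shader terms that is a dot product feeding a brightness multiplier. Here is the same math as a small JavaScript sketch (the names and the strength constant are mine; the real version runs per pixel in GLSL):

// Doppler-beaming brightness factor for the accretion disk (illustrative sketch).
// diskVelocity: unit tangent of the disk's rotation at the hit point.
// rayDir: unit direction of the camera ray at the hit point.
function dopplerBoost(diskVelocity, rayDir, strength = 0.6) {
  // Positive when the disk material moves toward the camera (against the ray).
  const approach = -(diskVelocity[0] * rayDir[0] +
                     diskVelocity[1] * rayDir[1] +
                     diskVelocity[2] * rayDir[2]);
  // Brighten the approaching side, dim the receding side.
  return 1.0 + strength * approach;
}

// Example: material moving straight toward the camera gets a 1.6x boost.
console.log(dopplerBoost([0, 0, -1], [0, 0, 1])); // 1.6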

3. Keplerian Orbital Mechanics (The JavaScript)

The background stars aren't just animating on a linear path. They follow Kepler’s 3rd Law of Planetary Motion ($T^2 \propto r^3$).

I engineered the JavaScript layer to handle the state management of the celestial bodies:

  • Blue Giant: Far orbit, period set to exactly 10.0s.

  • Red Dwarf: Near orbit.

  • Math: I calculated the inner star's period dynamically based on the ratio of the semi-major axes to ensure physical plausibility.

// Keplerian ratio preserved: T^2 ∝ r^3, so T scales with r^1.5
const r2 = 7.5;                        // inner star's semi-major axis
const ratio = Math.pow(r2 / r1, 1.5);
const T2 = T1 * ratio;                 // derive the inner period from the outer star's (r1, T1)

Procedural Generation: Noise without Textures

Loading external textures introduces HTTP requests and cross-origin issues. To keep the architecture monolithic and fast, I generated the starfield and galaxy band procedurally using GLSL hash functions (pseudo-random noise).

By manipulating the frequency and amplitude of the noise, I created a "soft" nebula effect that adds depth without the GPU cost of high-res texture sampling.
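
The hash follows the classic fract(sin(dot(...))) pattern. Here is a JavaScript sketch of the same idea, with the usual "magic" constants (treat this as an illustration of the technique rather than the post's exact shader code):

// Hash-based value noise, the same trick used in the shader for the starfield.
function hash2(x, y) {
  // Classic GLSL-style hash: fract(sin(dot(p, k)) * largeNumber).
  const s = Math.sin(x * 127.1 + y * 311.7) * 43758.5453;
  return s - Math.floor(s); // pseudo-random value in [0, 1)
}

function valueNoise(x, y) {
  const xi = Math.floor(x), yi = Math.floor(y);
  const xf = x - xi,        yf = y - yi;
  // Smoothstep fade for soft interpolation between lattice points.
  const u = xf * xf * (3 - 2 * xf);
  const v = yf * yf * (3 - 2 * yf);
  const lerp = (a, b, t) => a + (b - a) * t;
  return lerp(
    lerp(hash2(xi, yi),     hash2(xi + 1, yi),     u),
    lerp(hash2(xi, yi + 1), hash2(xi + 1, yi + 1), u),
    v
  );
}

// A "star" appears wherever the raw hash exceeds a high threshold.
const star = hash2(42, 17) > 0.998 ? 1.0 : 0.0;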

UX: Spherical Camera System

A visualization is only useful if it can be explored. I implemented a custom camera system based on Spherical Coordinates ($\rho, \theta, \phi$) rather than Cartesian vectors. This prevents "gimbal lock" and allows the user to orbit the singularity smoothly using mouse drags or touch gestures.
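
Concretely, the camera stores a radius plus two angles and converts them to a Cartesian eye position every frame. A minimal sketch of that conversion (variable names, drag sensitivities, and the clamp range are illustrative):

// Orbit-camera state: radius (zoom), theta (azimuth), phi (polar angle), y-up.
const cam = { radius: 12.5, theta: 0.0, phi: Math.PI / 2 };

function onDrag(dx, dy) {
  cam.theta += dx * 0.005;
  // Clamp the polar angle away from the poles so "up" stays well-defined.
  cam.phi = Math.min(Math.PI - 0.01, Math.max(0.01, cam.phi + dy * 0.005));
}

function cameraPosition() {
  const { radius: r, theta, phi } = cam;
  return [
    r * Math.sin(phi) * Math.cos(theta),
    r * Math.cos(phi),
    r * Math.sin(phi) * Math.sin(theta),
  ];
}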

The state is logged to the console for debugging specific views:

Camera Pos: [-3.34, 0.10, -12.06], Zoom (Radius): 12.51

Conclusion

This project demonstrates that modern browsers are capable of heavy scientific visualization if we optimize the pipeline correctly. By combining physics-based rendering, adaptive algorithms, and low-level WebGL, we can simulate general relativity in real-time on consumer hardware.

Thursday, December 11, 2025

Central Limit Theorem visualization using JS

Realistic Galton Board

Physics-based collisions demonstrating the Central Limit Theorem
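
The statistics behind the board are worth spelling out: each ball's final bin is the sum of many independent left/right bounces, and sums of independent random variables tend toward a normal distribution. A minimal JavaScript sketch of that counting argument (no collision physics, unlike the actual demo):

// Each ball bounces left (0) or right (1) at every peg row; its bin is the sum of rights.
function dropBall(rows) {
  let bin = 0;
  for (let i = 0; i < rows; i++) bin += Math.random() < 0.5 ? 1 : 0;
  return bin; // Binomial(rows, 0.5), approximately normal when rows is large
}

const rows = 12, balls = 5000;
const bins = new Array(rows + 1).fill(0);
for (let i = 0; i < balls; i++) bins[dropBall(rows)]++;

// Print a rough histogram; the bell shape is the Central Limit Theorem at work.
bins.forEach((count, i) =>
  console.log(String(i).padStart(2), '#'.repeat(Math.round(count / 50))));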


Heisenberg Uncertainty Principle demonstration


Δx · Δp ≥ ℏ/2

Drag the slider to observe the trade-off (a rough numeric sketch follows the list):

  • Right: We clamp down on Position. The dot becomes sharp (we know where it is), but its Velocity becomes chaotic and unpredictable.
  • Left: We measure Momentum. The movement becomes smooth and predictable, but the particle expands into a fuzzy cloud (we don't know exactly where it is).
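
One way to think about the slider numerically: it picks the position spread Δx of a minimum-uncertainty (Gaussian) wavepacket, and the momentum spread then follows from Δx · Δp = ℏ/2. A rough sketch of that mapping (the slider-to-spread mapping below is my own illustration, not necessarily what the demo computes):

const HBAR = 1.054571817e-34; // reduced Planck constant, in J·s

// slider in [0, 1]: 1 = maximal position certainty, 0 = maximal momentum certainty.
function spreads(slider, minDx = 1e-12, maxDx = 1e-8) {
  // Interpolate Δx on a log scale so both extremes are reachable.
  const dx = maxDx * Math.pow(minDx / maxDx, slider);
  const dp = HBAR / (2 * dx); // a minimum-uncertainty packet saturates Δx·Δp = ℏ/2
  return { dx, dp };
}

console.log(spreads(1.0)); // tiny Δx, huge Δp: sharp dot, chaotic velocity
console.log(spreads(0.0)); // large Δx, small Δp: fuzzy cloud, smooth motion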

My notes on Designing Data-Intensive Applications

 1) Reliable, Scalable, and Maintainable Applications

DDIA opens by reframing reliability as “works correctly under adversity,” not “never fails.” It classifies faults (hardware, software, and—most common—human), argues for fault tolerance over fault prevention, and stresses measuring what matters: load (domain-specific parameters) and performance (especially latency percentiles and tail latency). It distinguishes scalability strategies (vertical vs. horizontal, elasticity) and frames maintainability as a first-class goal grounded in operability, simplicity, and evolvability. This lens anchors every technology choice in the book. 

Most surprising. How strongly the chapter elevates operability—tooling, visibility, and safe procedures—as part of system design, not an afterthought.

Biggest learning. Design APIs and dataflows to limit blast radius of human error; it’s the dominant failure mode in production. 


2) Data Models and Query Languages

The chapter compares relational, document, and graph models and shows how model choice shapes thinking, queryability, and change over time. It revisits NoSQL’s origins (scale, schema flexibility, specialized query operations) and shows the later convergence (RDBMS with JSON; some document stores supporting joins). It contrasts declarative vs. imperative query styles (SQL, Cypher, SPARQL, Datalog vs. Gremlin/MapReduce), emphasizing that declarative queries enable richer, engine-level optimization.

Most surprising. Datalog’s expressiveness and composability—rarely used in industry, but a clean mental model for complex relationships. 

Biggest learning. Pick the model that minimizes mismatch with access patterns; you can combine models, but every bridge you build becomes a maintenance burden.


3) Storage and Retrieval

Under the hood, databases rely on data structures like hash indexes, B-trees, and LSM-trees. B-trees give predictable point/range lookups with in-place updates; LSM-trees trade read amplification for high write throughput via compaction (SSTables). The chapter also separates OLTP vs. analytics: data warehouses, star/snowflake schemas, columnar storage, compression, sort orders, and materialized aggregates—each optimized for large scans and aggregations. The practical takeaway: your index and storage layout are workload contracts.

Most surprising. How dramatically compaction strategies (e.g., LSM) shift write/read trade-offs; “fast writes” are never free.

Biggest learning. Treat columnar storage and sort order as first-class design levers for analytics; they often beat “more CPU” by orders of magnitude.


4) Encoding and Evolution

Serialization is a compatibility contract: JSON/XML vs. binary formats like Protocol Buffers, Thrift, and Avro. DDIA explains schema evolution—backward/forward compatibility, field defaults, and name-based matching (not position) in Avro—and compares dataflow modes: via databases (schemas), services (RPC/REST), and asynchronous logs/queues (message passing, change data capture). The chapter’s design lens is “how do we change in production without breaking consumers?”

Most surprising. Practical details like Avro’s reader/writer schema negotiation and the hidden 33% size overhead of Base64 that many pipelines forget.
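
The overhead is easy to verify: Base64 maps every 3 bytes to 4 ASCII characters, so payloads grow by roughly a third. A quick check in Node.js:

// 3 MiB of binary data becomes 4 MiB of Base64 text (plus padding when length % 3 != 0).
const raw = Buffer.alloc(3 * 1024 * 1024);
const encoded = raw.toString('base64');
console.log(raw.length, encoded.length, encoded.length / raw.length); // 3145728 4194304 1.333...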

Biggest learning. Treat schemas like APIs and roll out change with compatibility plans across every hop in your dataflow—DBs, services, and streams.


5) Replication

Why replicate? Availability, locality, throughput. DDIA surveys leader–follower (sync/async), multi-leader, and leaderless (quorum-based) designs, then digs into the pain: replication lag and read phenomena (read-your-writes, monotonic reads, consistent prefix), write conflicts, failover safety, hinted handoff, and concurrent write resolution. Each topology optimizes a different corner of the CAP space and operational reality. The headline: choose replication strategy to match write patterns, failure semantics, and geo constraints—and be explicit about read guarantees.

Most surprising. Quorum reads/writes do not imply linearizability under partitions and delays—your “consistent” system may still return surprising results.

Biggest learning. Specify user-visible guarantees (RYW/monotonic/consistent prefix) and instrument them; otherwise, lag will surface as correctness bugs.


6) Partitioning

Partition (shard) to scale writes and storage: by key-range (good for range scans, vulnerable to hotspots) or hash (great balance, sacrifices locality). Then reconcile secondary indexes across partitions (by document vs. by term), plan for skew mitigation, and decide on rebalancing (automatic vs. manual) and request routing. Parallel queries traverse shards; operational playbooks must keep routing tables and ownership consistent during moves. Partitioning is where data model meets operability.

Most surprising. The operational complexity of global secondary indexes across shards—fast reads now create hard rebalancing problems later.

Biggest learning. Design hot-key mitigation (key salting, time-bucketing, or write fan-in) on day one; you will need it.
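
For reference, a minimal sketch of key salting (the store interface and salt count are illustrative, not from any particular database): writes to a hot key are spread across N sub-keys, and reads fan in across all of them.

const SALTS = 16; // how many sub-keys one hot key is spread across

// Writes: append a random salt so one logical key maps to SALTS physical keys.
function saltedKey(hotKey) {
  return `${hotKey}#${Math.floor(Math.random() * SALTS)}`;
}

// Reads: fan out over every salted variant and merge the results.
async function readHotKey(store, hotKey) {
  const parts = await Promise.all(
    Array.from({ length: SALTS }, (_, i) => store.get(`${hotKey}#${i}`))
  );
  return parts.flat(); // the merging strategy depends on the data model
}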


7) Transactions

ACID is not one thing; it is a family of behaviors, and many systems ship with weak isolation by default. DDIA walks through anomalies (dirty reads/writes, lost updates, write skew, phantoms) and the isolation continuum: Read Committed, Snapshot Isolation, Repeatable Read, and full Serializability—implemented via 2PL or Serializable Snapshot Isolation (SSI). It also covers two-phase commit (2PC) and its realities. The advice: pick isolation deliberately for each workload; measure, don’t assume.

Most surprising. SQL isolation names are historically inconsistent; “Repeatable Read” may be snapshot isolation, not serializable, depending on the engine.

Biggest learning. SSI provides a pragmatic path to serializable behavior without the worst 2PL downsides—if you accept aborts and tune for them.


8) The Trouble with Distributed Systems

Distributed systems fail partially: networks drop, reorder, and delay messages; processes pause (GC); clocks drift; timeouts are ambiguous; Byzantine behavior exists. You cannot know if a timed-out request was processed. DDIA demystifies clocks (time-of-day vs. monotonic), fault detection limits, and why synchronized clocks are shaky foundations. The mindset shift: design for gray areas with idempotency, retries, and explicit uncertainty handling.

Most surprising. “Timeouts don’t tell you what happened”—even mature teams forget this in write paths and background jobs.

Biggest learning. Build protocols around idempotence, fencing tokens, and compensations; correctness must survive duplicate or reordered messages.
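
A fencing token is simply a monotonically increasing number handed out with each lock grant; the storage layer rejects any write whose token is lower than one it has already seen. A tiny in-memory sketch (the interfaces are illustrative, not any particular library's):

// Lock service: every successful acquisition returns a strictly higher token.
class LockService {
  constructor() { this.token = 0; }
  acquire(resource) { return { resource, token: ++this.token }; }
}

// Storage service: reject writes carrying a token older than the newest seen.
class Storage {
  constructor() { this.data = new Map(); this.highestToken = new Map(); }
  write(resource, value, token) {
    const highest = this.highestToken.get(resource) ?? -1;
    if (token < highest) throw new Error(`stale token ${token} (seen ${highest})`);
    this.highestToken.set(resource, token);
    this.data.set(resource, value);
  }
}

// A paused client resuming with an old token can no longer clobber newer writes.
const locks = new LockService(), storage = new Storage();
const a = locks.acquire('file'); // token 1, then this client stalls (e.g. a long GC pause)
const b = locks.acquire('file'); // token 2
storage.write('file', 'from b', b.token); // ok
storage.write('file', 'from a', a.token); // throws: stale token 1 (seen 2)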


9) Consistency and Consensus

This chapter untangles consistency models: linearizability (single-copy illusion), causal ordering, and total order broadcast; then connects them to distributed transactions (2PC) and fault-tolerant consensus (e.g., Paxos/Raft) and coordination services (ZooKeeper/etcd). It emphasizes the cost of linearizability and when you actually need it (e.g., uniqueness constraints, leader election). The core is choosing the weakest model that preserves invariants.

Most surprising. Quorum configurations can still be non-linearizable; ordering and causality matter as much as “how many nodes agreed.”

Biggest learning. Externalize coordination to a dedicated, well-understood service; don’t reinvent ad-hoc consensus inside your business logic.


10) Batch Processing

DDIA traces batch from Unix pipes to MapReduce to DAG engines (Spark, Tez, Flink): move compute to data; materialize intermediate state; join at scale (reduce-side vs. map-side); and design for skew. The point is not fetishizing frameworks but understanding when a recomputation model is simpler and more correct. Batch complements services and streams by providing determinism, reprocessing, and backfills over immutable logs. 

Most surprising. Many “real-time” features are healthier as overnight recomputations with well-tracked freshness SLAs. 

Biggest learning. Make the log your source of truth; batch is how you rebuild correct derived state when models or code change.


11) Stream Processing

Streams turn events into state via partitioned logs, CDC, and event sourcing. The chapter explains processing time vs. event time, watermarks, windowing, stateful operators, stream joins, and fault tolerance via checkpoints and idempotence. Systems like Kafka + stream processors keep services and analytics in sync by replaying rather than mutating state in place. The broader message: unify batch and stream around the same log. 

Most surprising. “Exactly once” is a protocol property (idempotence + atomic commit/checkpointing), not magic—understand where duplicates are eliminated.

Biggest learning. Embrace immutable event logs and derive all read models from them; recovery becomes replay, not surgery.


12) The Future of Data Systems

The finale argues for derived data and unbundled databases: compose specialized stores (OLTP, search, analytics, cache) with dataflow—CDC, validation, constraints, and observability—to maintain correctness across systems. It emphasizes end-to-end arguments for guarantees, trust-but-verify checks, and the ethics of data use (timeliness, integrity, privacy). The north star is building applications around dataflow instead of around databases, so recomputation and evolution are routine.

Most surprising. Treating the database as just another consumer of the log reframes integration and unlocks simpler, auditable architectures.

Biggest learning. Product correctness is a system property across boundaries; codify constraints and verification in the dataflow—not only in one database.


Closing takeaways

  • Make guarantees explicit (isolation level, read semantics, freshness) and monitor them like SLOs, not folklore. 

  • Choose the weakest consistency that preserves invariants; spend linearizability budget only where invariant violations are existential.

  • Architect around an immutable log; let batch and stream recompute derived views safely. 

Tuesday, September 9, 2025

When AI “fixes” tests, quality breaks

Treating tests as disposable artifacts that an AI can write or “auto‑repair” turns a core quality practice into a vanity metric. Tests are executable specifications; they encode intent. When an AI modifies failing tests to match the implementation, the direction of alignment flips—code no longer conforms to the spec; the spec is rewritten to bless whatever the code just did. That erodes trust in the suite.

Thursday, June 19, 2025

No AI at coding interviews: Prioritizing algorithmic thinking

When evaluating senior engineer candidates, it's vital to focus on pure algorithmic thinking rather than their ability to manipulate AI tools. The essence of a senior-level engineering role rests on creative problem-solving, thoughtful reasoning, and an in-depth understanding of algorithms. Relying on AI during live coding assessments risks masking a candidate's true capability. If a candidate solely depends on AI-generated solutions, it becomes difficult to gauge whether they possess the foundational skills necessary for high-level design and complex decision-making.

There’s a compelling argument for designing interviews that first filter for candidates who demonstrate strong independent cognitive processes. Once that baseline is established, integrating AI coding tools into later evaluation stages—such as onboarding or specialized training programs—can be a valuable extension. This phased approach ensures that candidates have proven their capacity for raw problem-solving and originality before leveraging AI as a productivity tool. In this way, the interview process remains a pure measure of human ingenuity while aligning with an evolving, AI-supported workplace culture.

Critics of a strict no-AI policy in technical interviews point out that the modern workplace increasingly relies on AI to boost productivity and streamline tasks. They argue that an “AI-first” culture is not just the future, but the present, necessitating that candidates show proficiency in guiding AI to produce optimal results. However, there is a clear distinction between using AI for supplementary functions, such as refreshing syntax knowledge or aiding in non-critical tasks, and relying on it to do the heavy cognitive lifting of problem solving. A candidate who enhances their thinking with AI might still possess excellent strategic skills, but the immediate challenge remains: identifying true algorithmic talent without any external computational aid.

The ideal hiring process should strike a balance—ensuring that candidates can solve tasks independently while gradually introducing AI-based tools as part of their professional development. This approach not only validates their deep-rooted technical abilities but also ensures that they are well-equipped to integrate with a future-facing, AI-augmented work environment. Ultimately, the best engineers will be those who showcase exceptional problem-solving prowess first and then learn to harness AI to amplify their contributions. 

Thursday, June 5, 2025

AI agents: Data, Action and Orchestration

Let's explore how modern AI agents are built using three broad categories of tools—Data, Action, and Orchestration—and then see how these relate to concepts like retrieval augmented generation (RAG) and memory, along with the collaborative architectures of multi-agent systems.

1. Types of AI Agent Tools

  • Data Tools: Data tools are the workhorses for gathering, processing, and refining information. They enable agents to query databases, retrieve web data, or parse documents, creating the critical knowledge base an AI uses to generate insights. In many applications, these systems incorporate retrieval augmented generation (RAG) techniques. Traditional RAG methods pull in relevant external information to support generative models, while agentic RAG goes further by granting the agent autonomy—allowing it to decide when and what additional data to retrieve based on its goals. This dynamic retrieval makes the process not only reactive but also proactive in achieving task-specific outcomes.

  • Action Tools: Action tools empower an AI to effect change in its environment. They include APIs, control systems, or robotic actuators that transform decisions into real-world or digital actions. These tools enable the agent not just to talk about tasks but to execute them: from sending messages and updating records to controlling machines or interfacing with other software systems.

  • Orchestration Tools: When complex tasks require multiple specialized agents or modules, orchestration tools step in as the conductors of the system. They manage workflows, coordinate tasks among various sub-agents, and ensure that all actions align with the overarching goals. This coordination is crucial in systems where different processes must work together seamlessly—whether they’re organized in a hierarchy or running in parallel.

2. Traditional vs. Agentic RAG

In traditional RAG, the agent uses a fixed strategy to retrieve information—essentially tapping into a data reservoir to enhance its responses. In contrast, agentic RAG embodies a more dynamic approach. Here, the agent actively determines what to fetch, when to fetch it, and how best to integrate that data into its outputs. This level of agency means the retrieval process adapts in real time, tailored to the context of the ongoing task and the agent’s evolving objectives.
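
The difference shows up clearly in code. Below is a deliberately simplified sketch of an agentic retrieval loop, where llm and searchDocs are hypothetical stand-ins for a model call and a data tool: the agent itself decides whether to retrieve more, and with what query, before answering.

// llm(prompt) -> string and searchDocs(query) -> array are hypothetical helpers.
async function agenticRag(question, { llm, searchDocs, maxSteps = 3 }) {
  let context = [];
  for (let step = 0; step < maxSteps; step++) {
    // The agent decides whether it needs more data and, if so, what to fetch.
    const decision = JSON.parse(await llm(
      `Question: ${question}\nContext so far: ${JSON.stringify(context)}\n` +
      `Reply as JSON: {"action": "search", "query": "..."} or {"action": "answer"}`
    ));
    if (decision.action === 'answer') break;
    context = context.concat(await searchDocs(decision.query));
  }
  // Traditional RAG would instead do a single fixed retrieval before this final call.
  return llm(`Answer using this context: ${JSON.stringify(context)}\nQuestion: ${question}`);
}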

3. Memory in AI Agents

Memory plays a pivotal role in the effectiveness of AI systems:

  • Short-Term Memory: This is the agent’s active workspace. It holds context during a conversation or session, enabling the AI to keep track of the immediate flow of information and maintain coherent interactions over the span of a single task or dialogue.

  • Long-Term Memory: In contrast, long-term memory allows the agent to store information across multiple sessions. This could include user preferences, historical interactions, or accumulated experience. By leveraging long-term memory, AI agents can offer more personalized service over time and adapt their behavior based on past interactions.

4. Multi-Agent Architectures

Modern AI systems often deploy collaborative architectures to manage complex tasks:

  • Multi-Agent Crews: These are teams of agents that work on different aspects of a problem concurrently. Each agent may specialize in a particular domain, and together they pool their expertise to handle intricate or multifaceted challenges.

  • Hierarchical Systems: In a hierarchical setup, agents are arranged in layers. Higher-level agents set strategic goals and delegate tasks to lower-level ones, which handle more granular, tactical operations. This structure mirrors organizational hierarchies where executive decisions filter down to operational actions.

  • Parallel Agents: With parallel agents, multiple entities operate concurrently and independently on different subtasks. Their outputs are later integrated, enabling the overall system to perform large-scale tasks efficiently without the bottlenecks of serial processing.

Bringing It All Together

By combining data tools (especially those enhanced by agentic RAG), action tools, and orchestration tools, modern AI agents achieve a balance between data-driven insight and effective execution. Integrating short-term and long-term memory enhances contextual understanding and personalization, while multi-agent architectures—whether it be multi-agent crews, hierarchical systems, or parallel agents—allow for scalable and adaptive problem-solving. These design principles are key in enabling AI to operate with both autonomy and cohesion, dynamically adjusting its strategy to meet evolving challenges.

Monday, February 10, 2025

DeepSeek R1: A Path to Advanced Reinforcement Learning

DeepSeek R1 begins its journey with the 'zero' path: DeepSeek-R1-Zero applies reinforcement learning directly to the base model, with no supervised fine-tuning beforehand. This near-'tabula rasa' starting point offers a unique view of what RL can elicit when reasoning behavior is not seeded by curated examples. It sets the foundation for the exploration and learning process that follows, underscoring how much the initial conditions shape the RL setup.

The foundation of DeepSeek R1's success lies in its meticulous reinforcement learning setup. This phase involves defining the environment, reward functions, and the agents' actions and observations. The setup serves as the playground where the agents interact, learn from their actions, and optimize their strategies to maximize rewards. This section delves into the technical aspects of creating a robust RL environment that fosters effective learning and adaptation.

A standout feature of DeepSeek R1 is its Group Relative Policy Optimization (GRPO) algorithm. GRPO reframes policy optimization around the relative performance of a group of responses sampled for the same prompt: instead of scoring each response against a separately learned value (critic) baseline, it normalizes each reward against the group's own statistics, leading to more stable and cheaper policy updates. This section explores the mechanics of GRPO, its advantages, and its impact on the learning process.
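
As described in DeepSeek's reports, each response's advantage is its reward normalized against the group's mean and standard deviation. A minimal sketch of just that advantage computation (the clipped policy-ratio objective and KL penalty that complete GRPO are omitted):

// GRPO-style advantages: normalize each sampled response's reward within its group.
function groupRelativeAdvantages(rewards) {
  const mean = rewards.reduce((a, b) => a + b, 0) / rewards.length;
  const variance = rewards.reduce((a, b) => a + (b - mean) ** 2, 0) / rewards.length;
  const std = Math.sqrt(variance) || 1; // guard against a zero-variance group
  return rewards.map(r => (r - mean) / std);
}

// Example: 4 responses sampled for one prompt, scored by a rule-based reward.
console.log(groupRelativeAdvantages([1, 0, 0, 1])); // [1, -1, -1, 1]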

The results of DeepSeek-R1-Zero are a testament to the system's capabilities and the effectiveness of its methodologies. This section presents a comprehensive analysis of the outcomes, highlighting key performance metrics, comparative results, and notable achievements. The data showcases the system's ability to learn, adapt, and optimize its strategies in diverse environments, providing valuable insights into the potential of RL systems.

To enhance the learning process, DeepSeek R1 incorporates a cold-start supervised fine-tuning phase. This approach gives the model a head start before RL: by first fine-tuning on a small, curated set of long chain-of-thought examples, the system stabilizes early training, accelerates the learning process, and improves initial performance. This section examines the rationale behind cold-start supervised fine-tuning and its impact on the overall learning curve.

A language-consistency reward for Chain-of-Thought (CoT) is another technique employed by DeepSeek R1. During RL, chains of thought tend to drift into mixed languages; the consistency reward penalizes this, keeping the reasoning in the language of the prompt and encouraging coherent, readable thought processes. This section explores the implementation and benefits of this reward in the RL framework.

Generating high-quality data for supervised fine-tuning is a critical aspect of DeepSeek R1's success. This phase involves creating diverse and representative datasets that capture various scenarios and challenges within the RL environment. The generated data serves as the foundation for supervised learning, enabling the agents to develop a strong baseline knowledge. This section discusses the methodologies and considerations involved in data generation for supervised fine-tuning.

DeepSeek R1 takes reinforcement learning to the next level by incorporating a neural reward model. This model leverages neural networks to predict and assign rewards based on the agents' actions and states. The neural reward model enhances the system's ability to learn complex and dynamic reward structures, leading to more sophisticated and effective strategies. This section delves into the architecture and implementation of the neural reward model in the RL framework.

The distillation phase in DeepSeek R1 plays a crucial role in refining and optimizing the learned policies. Distillation involves transferring knowledge from a high-capacity model to a more compact and efficient model, ensuring that the distilled model retains the essential knowledge and performance characteristics of the original. This section explores the distillation process, its benefits, and its impact on the overall efficiency and scalability of DeepSeek R1.

DeepSeek R1 represents a significant advancement in the field of reinforcement learning, showcasing a comprehensive and innovative approach to policy optimization and learning. From its zero path beginnings to the incorporation of cutting-edge techniques like GRPO, CoT, and neural reward models, DeepSeek R1 exemplifies the potential of RL systems in tackling complex challenges and achieving remarkable results. As the field continues to evolve, DeepSeek R1 stands as a testament to the power of innovation and meticulous design in the pursuit of advanced artificial intelligence.