{
  "version": 1,
  "records": [
    {
      "id": "blog:ai-is-a-broad-term",
      "type": "blog",
      "title": "AI is Actually a Broad Term",
      "slug": "ai-is-a-broad-term",
      "url": "/blogs/ai-is-a-broad-term.html",
      "sourcePath": "blogs/ai-is-a-broad-term.md",
      "tags": [
        "AI",
        "Machine Learning",
        "Education"
      ],
      "description": "A plain-language map of AI, machine learning, deep learning, LLMs, algorithms, and decision-making systems.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "AI is Actually a Broad Term A plain-language map of AI, machine learning, deep learning, LLMs, algorithms, and decision-making systems. AI is Actually a Broad Term | Behnam Khorsandian AI is everywhere. From your phone's autocorrect to self-driving cars, the term “AI” is thrown around so often that it seems to mean anything and everything. But what actually counts as AI? Is a simple calculator AI? What about your spam filter? Or the chatbot you argue with every day? AI is actually a very broad and sometimes misleading term that covers a range of technologies. Some systems follow basic rules, while others seem almost human-like in their ability to understand and generate responses. Understanding these differences is a must for making sense of AI’s role in our lives. Before we start, let’s agree on a simple fact: AI is not magic. It’s just a tool, one that ranges from simple automation to highly sophisticated systems. The more we understand these layers, the more we can appreciate AI’s strengths and limitations. To illustrate just how broad the term AI really is, take a look at the diagram below. “Artificial Intelligence” (AI) is the general name we give to machines doing tasks that normally require human intelligence, which means they are \"smarter\" than an average algorithm. It's the difference between a traffic light that works with a fixed timer and a traffic light that works based on how busy the intersection is. This \"smartness\" comes in different levels, divided by the algorithm's level of sophistication, or in other words by \"being more similar to us\". Within AI, we have “Machine Learning” (ML), algorithms that learn patterns from data rather than following explicit instructions. 
Nested within Machine Learning is “Deep Learning” (DL), which not only learns like a human, but uses neural networks (loosely, a digital version of the neurons in our brains) with many layers to extract increasingly abstract representations from raw inputs. (We will get to that later.) And inside this deep learning world, we find the famous “Large Language Models” (LLMs), which we have been getting familiar with since November 30, 2022, when ChatGPT was born. The big paradigm shift in human history. First Things First, What is an Algorithm? Every computer program or digital product you encounter has 3 main elements: Input, Process, and Output. Inputs can be any type of data coming from the user or the environment: words, numbers, images, sensor readings, or even the press of a button on the device. Outputs are also one of the types we have mentioned, or even an action by the device, like the movement of a robotic arm. The processing part, which determines the output based on the inputs, is called the Algorithm. If you think about it, everything is an algorithm, even for human beings. This short clip gives the same energy I have about this very matter... Let's begin with a simple example: making a cup of tea. First, you have to turn on the stove, wait for the water to boil, and then pour the hot water on the tea and wait for 2 minutes. The same goes for machines, with more or fewer steps. In computer science we have different types of algorithms: Sorting, Searching, Path Finding, Optimization, Compression and many more, but that is for another blog post. Since this one is about Artificial Intelligence, I'm going to give you an example of Decision Making Algorithms, which capture the concept of Intelligence in machines much better. When we think about intelligence in humans, one of the first reasons we call ourselves intelligent beings is that we can choose and make decisions not only based on instinct, but also based on a \"logical\" process in our mind we call \"thinking\". 
Humans gather inputs (our senses), process that information (our thoughts), and produce outputs (our actions). Similarly, machines use sensors, data, and algorithms to make decisions and act. What makes decision-making algorithms particularly fascinating is how they mimic human reasoning in a structured, logical way, often following steps like “if this happens, then do that.” Let's talk about another everyday example: a traffic light. In the beginning, the “algorithm” was simply a human decision maker, a police officer standing at the intersection, instructing one side to stop and the other side to go. The Input was the officer’s observation of traffic: how many cars are in each lane, whether there are pedestrians waiting to cross, and so on. The Processing was the officer’s decision-making process, simply based on common sense and experience. The Output was just a signal to the drivers: a hand gesture, a whistle, or a traffic sign movement. To reduce the officer’s workload, a simple digital system was introduced: a traffic light controlled by the officer. The officer would observe the traffic (Input) and press buttons to switch the light from red to green or yellow (Processing). The Output was the automated light changing, providing a more visible and consistent signal for drivers. Then, thanks to transistors, we removed the need for constant human input by introducing a timer. The Input here was time: a predefined interval for each light (green for 30 seconds, yellow for 5 seconds, and red for 30 seconds). The Processing was a simple algorithm in early versions, just a loop that switched lights after a programmed duration. But as we know, these days everything is controlled with image processing and more complex systems. The traffic light uses Inputs from sensors or cameras, like the number of cars waiting, the presence of pedestrians, or even the time of day. 
The Processing involves more sophisticated algorithms, which deep down can be a series of rules and conditions that dictate what the light should do based on the current conditions. This is a simple example of the structure behind our first machine learning algorithm, a Decision Tree. This simple Decision Tree Flowchart tells the traffic light: IF there are cars on the main road AND there is no car on the other one, turn the main road light green, and the other one red. BUT if there are cars on the other side too, just use a timer in a way that balances out the traffic flow. Also be careful: if there are pedestrians waiting, give priority to their signal first. But how do these machines \"understand\" the conditions of the traffic to balance the flow between them? Keep in mind that every decision made by a machine represents a glimpse of how “intelligence” is built into the digital world, inspired by nature and humans."
    },
    {
      "id": "blog:ai-too-long-temporary-and-full-of-spoilers",
      "type": "blog",
      "title": "AI: Too Long, Temporary & Full of Spoilers",
      "slug": "ai-too-long-temporary-and-full-of-spoilers",
      "url": "/blogs/ai-too-long-temporary-and-full-of-spoilers.html",
      "sourcePath": "blogs/long-and-incompelete.md",
      "tags": [
        "AI",
        "Machine Learning",
        "Longform"
      ],
      "description": "A long-form, temporary overview of AI concepts, history, cognition, machine learning, and the topics later split into shorter posts.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "AI: Too Long, Temporary & Full of Spoilers A long-form, temporary overview of AI concepts, history, cognition, machine learning, and the topics later split into shorter posts. AI: Too Long, Temporary & Full of Spoilers | Behnam Khorsandian Since this turned into a very long post (a 35-minute read), I have decided to break it down into smaller, more digestible chunks. I’ll be rewriting all of the topics covered in this post (with more to cover in each chapter) and continuing the unfinished discussion in the new posts. I highly recommend reading the other posts, but feel free to read this one; it's still pretty comprehensive for newbies in AI. I remember when they told us in school that the only thing that separates us from animals is intelligence, and I remember that being the smartest species on earth was a huge flex for us and made all of humanity think that we own the planet. Well, we have managed to create something that changes everything. We alone created languages, solved complex problems, wrote literature, made art, and built skyscrapers, satellites, and the internet. However, in the last few decades, this exclusivity has started to fade. Machines, once limited to printing “Hello World” on a black and white screen, now ignore your request with a simple \"Sorry, I can not assist with that\". Today, artificial intelligence (AI) can compose text that can pass as our own, drive cars more reliably than some humans, and in some cases detect diseases from medical images with higher accuracy than many doctors. In fact, in at least one study, the results got worse once doctors joined forces with the AI, which means doctors should leave the AI alone and let it do its job, literally (You can read the whole story and the paper here). I would be lying if I said I didn't try to write this post with ChatGPT, but in the end, it just couldn't do it the way I like, and that shows AI can't replace humans that easily. 
This is one of the reasons I started this post (I will talk about the rest later): to show you that AI is not a threat, it's just a tool, and we should not worry about it. I believe that if we understand how these things work, what the buzzwords you hear in the media mean, and learn about the process behind this magical tool, we will be more interested in using it. After all, humans reached the point where we stand right now simply because they used their tools. So before you panic and start planning for the robot apocalypse, let's see what's actually happening inside the \"mind\" of these digital beings we've created... \"A.I.\" is Actually a Broad Term We start with a big-picture view: “Artificial Intelligence” is the general name we give to machines doing tasks that normally require human intelligence, which means they are \"smarter\" than an average algorithm. It's the difference between a traffic light that works with a fixed timer and a traffic light that works based on how busy the intersection is. This \"smartness\" comes in different levels, divided by the algorithm's level of sophistication, or in other words by \"being more similar to us\". Within AI, we have “Machine Learning” (ML), algorithms that learn patterns from data rather than following explicit instructions. Nested within Machine Learning is “Deep Learning” (DL), which not only learns like a human, but uses neural networks (loosely, a digital version of the neurons in our brains) with many layers to extract increasingly abstract representations from raw inputs. (We will get to that later.) And inside this deep learning world, we find the famous “Large Language Models” (LLMs), which we have been getting familiar with since November 30, 2022, when ChatGPT was born. The big paradigm shift in human history... 
AI can be literally any product on this map These layers aren’t just random marketing terms; they help us understand how computers evolved from simple rule-following devices (like the traffic light) to systems capable of recognizing faces, translating languages, and generating A+ essays. Just as humans rely on multiple levels of cognition (senses, intuition, and reasoning), AI stacks various levels of abstraction to achieve remarkable feats. Actually, this is how we are going to move forward in this post, comparing human beings to AI, because we are not that different. So let's start from the bigger scope and narrow it down... First Things First, What is an Algorithm? Every computer program or digital product you encounter has 3 main elements: Input, Process, and Output. Inputs can be any type of data coming from the user or the environment: words, numbers, images, sensor readings, or even the press of a button on the device. Outputs are also one of the types we have mentioned, or even an action by the device, like the movement of a robotic arm. The processing part, which determines the output based on the inputs, is called the Algorithm. If you think about it, everything is an algorithm, even for human beings. Let's begin with a simple example: making a cup of tea. First, you have to turn on the stove, wait for the water to boil, and then pour the hot water on the tea and wait for 2 minutes. The same goes for machines, with more or fewer steps. In computer science we have different types of algorithms: Sorting, Searching, Path Finding, Optimization, Compression and many more, but that is for another blog post. Since this one is about Artificial Intelligence, I'm going to give you an example of Decision Making Algorithms, which capture the concept of Intelligence in machines much better. 
When we think about intelligence in humans, one of the first reasons we call ourselves intelligent beings is that we can choose and make decisions not only based on instinct, but also based on a \"logical\" process in our mind we call \"thinking\". Humans gather inputs (our senses), process that information (our thoughts), and produce outputs (our actions). Similarly, machines use sensors, data, and algorithms to make decisions and act. What makes decision-making algorithms particularly fascinating is how they mimic human reasoning in a structured, logical way, often following steps like “if this happens, then do that.” Engineering Problem Solving Let's talk about another everyday example: a traffic light. In the beginning, the “algorithm” was simply a human decision maker, a police officer standing at the intersection, instructing one side to stop and the other side to go. The Input was the officer’s observation of traffic: how many cars are in each lane, whether there are pedestrians waiting to cross, and so on. The Processing was the officer’s decision-making process, simply based on common sense and experience. The Output was just a signal to the drivers: a hand gesture, a whistle, or a traffic sign movement. To reduce the officer’s workload, a simple digital system was introduced: a traffic light controlled by the officer. The officer would observe the traffic (Input) and press buttons to switch the light from red to green or yellow (Processing). The Output was the automated light changing, providing a more visible and consistent signal for drivers. Then, thanks to transistors, we removed the need for constant human input by introducing a timer. The Input here was time: a predefined interval for each light (green for 30 seconds, yellow for 5 seconds, and red for 30 seconds). The Processing was a simple algorithm in early versions, just a loop that switched lights after a programmed duration. 
But as we know, these days everything is controlled with image processing and more complex systems. The traffic light uses Inputs from sensors or cameras, like the number of cars waiting, the presence of pedestrians, or even the time of day. The Processing involves more sophisticated algorithms, which deep down can be a series of rules and conditions that dictate what the light should do based on the current conditions. This is a simple example of the structure behind our first machine learning algorithm, a Decision Tree. This simple Decision Tree Flowchart tells the traffic light: IF there are cars on the main road AND there is no car on the other one, turn the main road light green, and the other one red. BUT if there are cars on the other side too, just use a timer in a way that balances out the traffic flow. Also be careful: if there are pedestrians waiting, give priority to their signal first. But how do these machines \"understand\" the conditions of the traffic to balance the flow between them? Keep in mind that every decision made by a machine represents a glimpse of how “intelligence” is built into the digital world, inspired by nature and humans. So it is worth taking a short detour to talk about the concept of \"Understanding\" in humans and machines... How does the mind work? Before getting deeper into how machines “think,” let’s revisit our own cognition first. We process a continuous stream of sensory data (sights, sounds, smells, tastes, and touches), and these signals travel into our minds. Psychologists like Daniel Kahneman describe our thinking in terms of two systems: System 1 and System 2. System 1 is fast, intuitive, and emotion-driven, often working automatically without conscious effort. System 2 is slower, more deliberate, and logical, engaging when we need to carefully reason through a problem, focus attention, or handle more complex tasks. 
While this framework was originally intended to help us understand human cognition, it’s useful to map these ideas onto how AI systems and computational models might operate. I suggest you learn more about these two systems in this video because it explains them much better than I can, but for the sake of the conversation, here are the key takeaways related to our topic: System 1 (Unconscious Mind) • Intuition and Pattern Recognition: In humans, System 1 quickly recognizes faces, reads simple words, and makes snap judgments based on familiarity and emotion. For machines, a similar effect is the use of a trained AI model that can rapidly infer a result: think about AI that can quickly identify an image (like Face ID on your phone), identify a spoken word (like auto-generated subtitles on YouTube), or suggest the next word in a sentence (like ChatGPT). They don’t “deliberate”; they simply apply the patterns they have previously learned (and that's why sometimes they don't work properly). • Statistical “Intuition”: Just as our brain’s System 1 relies on heuristics gleaned from past experience, a trained neural network relies on statistical patterns learned from data. Once trained, the network’s forward pass is similar to a System 1 response: it takes an input (for example an image) and quickly produces an output (labeling it as “cat” vs. “dog”) based on vast amounts of prior training. This is fast and efficient but not reflective or logical in a deep sense: it’s recognition, not reasoning. • Heuristics and Biases: Human System 1 is prone to certain biases and errors due to its reliance on heuristics. Similarly, AI models can exhibit biases based on their training data. They may rapidly produce an answer, but if the data is skewed or not representative, the answer might be systematically biased. Like System 1, these models don’t question their reasoning process; they just apply what they’ve internalized. 
Yes, you probably didn't notice that there is an extra \"THE\" in the sentence. System 2 (Conscious Mind) • Slower, More Focused Processing: In humans, System 2 is what we engage when we solve a math problem, plan a route without GPS, or consider the pros and cons of an important decision. In machines, System 2 analogs appear in processes that involve more explicit reasoning steps, such as search algorithms, symbolic reasoning engines, or “chain-of-thought” prompting in large language models. • Logical Inference and Long-Chain Reasoning: Consider a system that uses a knowledge graph or logical inference rules to solve a puzzle. Rather than instantly producing an answer from statistical associations, it methodically examines possibilities, applies logical constraints, and eventually arrives at a well-grounded conclusion. This is a form of machine-based System 2 thinking: slower, more resource-intensive, but capable of handling complexity and ambiguity better than a fast pattern-recognition system. • Explainability and Step-by-Step Reasoning: One hallmark of human System 2 is that we can explain our reasoning, that is, how we arrived at a conclusion. Certain AI approaches can similarly provide “rationales” or at least a reasoning trail. For instance, a planning algorithm that enumerates different paths before selecting one can show its steps. This makes it closer to System 2, as it “knows” the chain of decisions it took. • Meta-Cognition in Machines: Humans engage System 2 not only for complex tasks but also for monitoring and correcting System 1’s outputs. In AI, there are now techniques where a model’s quick answer (System 1) can be critiqued by another layer or component (System 2), which can verify, refine, or correct the initial guess. This meta-process is reminiscent of how a person might catch a “gut feeling” error by calmly reasoning through details. You can't solve this maze with intuition, you need to think step by step. 
But your \"gut feeling\" quickly tells you that this image is not centered properly, you don't need a ruler for that... Thinking hurts if you do it correctly Humans handle complexity by using both systems. System 1 is like muscle memory for your brain, giving quick judgments based on previous experience (like calculating 2x5). System 2 is a careful problem-solver, stepping in when precision and reasoning are needed (like calculating 14x17). That little pressure you feel in your brain when the calculation gets harder is called Cognitive Load. Machines feel the same pressure, but on their CPUs and GPUs. Some algorithms rely purely on pattern recognition like System 1, and sometimes they do it messily but quickly (like when you say \"Hey Siri\" and your friend's phone answers instead; it works, but still has flaws), while others incorporate more deliberate, logical steps and planning like System 2 (for example when you ask AI how many \"R\"s are in strawberry). You feel less cognitive load when you use the picture guide, and that's why you prefer that method, even if the written one has fewer steps. The more you use your System 1 in a task, the better the result you get, and the more experience your System 2 gains to later do the task faster and better! The same is true of AI: the more training resources (data, computation power, and time) you provide, the faster and smarter the model you get. That's why you see models getting smarter and, unexpectedly, cheaper as well. To wrap up this topic, I leave you with my favorite quote from Daniel Kahneman before we move on: \"Nothing in life is as important as you think it is, when you are thinking about it.\" Daniel Kahneman (Thinking, Fast and Slow) Acknowledging the World Our brain creates this experience which we call \"life\". This experience happens in a reality we call the \"world\", and we are continuously interacting with it. 
We humans naturally group the world’s endless complexity into manageable chunks. We notice colors, shapes, and movements instantly, thanks to pre-attentive attributes and gestalt principles. These principles are the reason you can say one of the charts is random, two of them have some sort of meaning, and the other one is definitely fake: Gestalt principles guide our visual perception and help explain how we effortlessly make sense of complex visual environments. The principles of Gestalt Theory have enhanced our understanding of human perception related to visual design and perceptual grouping. In the 1920s, Gestalt psychologists in Germany studied how people make sense of discrete visual elements by subconsciously organizing them into groups or patterns (System 1 in action again). The German word gestalt means \"shape or form.\" One of its founders, psychologist Kurt Koffka, described the Gestalt Theory as \"the whole is something else than the sum of its parts\", which means the unified whole takes on a different meaning than the individual parts. Pattern recognition is a fundamental capability that underlies both human intelligence and artificial intelligence. While humans excel at intuitive pattern recognition, machines approach it through systematic analysis of the data. We see faces in clouds and patterns in noise. AI, similarly, uses algorithms to detect patterns and anomalies. While we see a cat at a glance, a machine might see a grid of pixels. But by extracting features (edges, curves, textures), AI models learn to recognize objects just as reliably, sometimes more so. They don't need Gestalt principles for that; those principles are just our way of describing how we naturally perceive and organize visual information. AI finds patterns in data with calculations... 
Simple Pattern Recognition in Machines to Classify Handwritten Numbers AI learns just like humans do When we talk about “learning” in the context of AI, especially in the analogy of System 1 and System 2, it’s worth dissecting what’s really happening behind the scenes. Machine learning does not learn in the human sense, where we integrate knowledge into a rich tapestry of experience and context. Instead, machines adjust parameters or manipulate symbolic representations to better perform a given task. But the process of learning is very similar to human learning, and this process has multiple steps. Before any learning can happen, a machine needs data. As humans, we automatically start forming internal representations of what we see or hear. A newborn baby doesn’t understand language yet, but by continuously receiving audio and visual input, they gradually discern patterns, like which sounds are associated with a parent’s face. For a machine, data could be a collection of images, text documents, sensor readings, or historical financial transactions. This raw data is analogous to the sensory stream the human infant receives. Without it, there’s no foundation from which the machine can learn. When a human infant encounters the world, there are no labels attached to objects. Before a parent ever says “This is a dog,” the infant’s brain is clustering shapes, sounds, and motions into rough categories. In the world of AI, we call that Unsupervised Learning, which means grouping similar data points without labels, like a baby noticing that round objects go together. For example, a clustering algorithm might discover that a batch of images naturally separates into groups: one group of round objects (balls), one of four-legged animals (dogs), another of leafy shapes (trees). There’s no label “dog” here, just a recognition that certain patterns recur together. 
When a parent points to a dog and says, “Dog,” the human child connects the sound/word “dog” to that pattern of fur, four legs, and tail-wagging. The child refines their mental model: not only are these shapes and movements one category, but now they have a name. For machines, Supervised Learning is the same concept. An image of a dog comes with the label “dog.” The model uses these pairs (input, label) to incrementally adjust its parameters so that it can predict “dog” for any similar image in the future. Over time, the machine builds a powerful mapping from visual features to the concept of “dog.” Without these labels, the model might understand groups of similar images but not what they represent. Now, we all know that these are not the only ways we learn in life. We mostly learn from the different experiences we have and, to be more specific, from our mistakes. You can see a baby crawling towards the red shiny thing on the table; they reach out to touch it, just to find out that it hurts their hand because it was a hot cup of tea. And that's how they learn they should not touch things that have steam coming out of them. The same happens when they say \"Mama\" or \"Dada\" for the first time and see their parents laughing, so they decide to do it more and more. In the example where touching a hot mug is painful, there’s no label, just an action (touching) followed by a consequence (pain). Gradually, the child learns to avoid the behavior. In the world of AI, we call this method Reinforcement Learning. An AI agent tries actions in an environment and receives rewards or penalties. Over time, it learns a policy, an internal mapping from states to actions that maximizes long-term reward. Instead of associating images with labels, it associates situations and behaviors with outcomes. 
This is more similar to System 2 engagement because it often involves planning and foresight: to achieve a long-term goal, the agent might need to take a series of steps and reflect on consequences, much like a human might strategize several moves ahead in a board game. This is the typical approach we take to train our pets, by giving them treats or punishments. Every time you Like, Save or comment under a post on Instagram (or any other application that asks for your feedback), you are training the algorithm for that application. That's why, as soon as you view a post for more than a few seconds, similar content starts to appear in your recommendations. In fact, this is part of how OpenAI trains its models. It's called RLHF, which stands for Reinforcement Learning from Human Feedback. Human reviewers look at the answers the AI generates for given prompts and score the different options based on which answer is more human-like. This method is also used to increase the safety of LLMs. Here are other examples which show your own contribution to this process: whenever you type the text you see in an image, or select the traffic lights in the picture for \"reCAPTCHA\", you are actually training an AI for OCR systems (Optical Character Recognition, which detects the text in an image) and Object Detection (useful in smartphones and self-driving vehicles). Some examples of training AI with the help of users, by labeling or evaluating the data. Now, to optimize the learning in these gamified algorithms, we can again copy Nature, which has been perfecting the art of adaptation and survival since the day life began. Instead of relying only on trial and error for improvement, we can introduce the concepts of \"population\" and \"mutation\", the same principles that drive evolution in the natural world. 
By allowing a diverse group of solutions to compete, adapt, and evolve, we unlock a powerful way to discover creative and effective strategies, just as species evolve traits to thrive in their environments. This approach not only accelerates learning but also encourages robustness and innovation in problem-solving. This method in Machine Learning is (obviously) called Genetic/Evolutionary Algorithms. We can teach an AI system to perform better by utilizing the elements of Evolution. Instead of training a single AI, we start with a diverse group (a Population) of different strategies or behaviors. We evaluate all of them to see which ones perform the task best or survive the environment longest (the Selection process). Then, we take the best strategies, mix them, and introduce small tweaks or randomness to explore new possibilities (aka Mutation). By repeating this process over and over, the algorithm refines the AI system (each round is called a Generation), just like evolution refines species in nature. Imagine you’re trying to teach a group of robots how to walk. At first, each robot tries different ways of moving its body parts: some end up crawling, some roll, and some try hopping. Most of them fail at first, but one or two manage to get further. Now, instead of starting over from scratch, you take the best-performing robots and “combine” their strategies (which we call Genes). Maybe one robot learned to balance well, while another discovered the best rhythm to move the legs. By blending these ideas and introducing small random changes in their parameters, the next generation of robots starts with a head start. Over several generations, this process creates a group of robots that walk smoothly. By mimicking nature’s process of evolution (the Darwinian way), we introduce creativity, adaptability, and resilience into AI systems. 
This approach doesn’t just improve performance, it also makes AI better at handling unexpected challenges and finding innovative solutions, thanks to the gamification built into this algorithm. Training is only one of the uses. The other is watching the learning process itself, not because we need the trained robot, but because we want to find out what strategy (or Policy) led to success. Whenever you see simulations of evolving crowds or creatures, it's often genetic algorithms at work. You can see a great and very fun example in this video: There are of course more learning methods in the real world, for both humans and machines, but we will end this section with one of the most widely used ones. Humans can often learn a new skill more easily if it’s related to something they already know. For instance, if you’ve learned one language, picking up another is easier. Or when you go to medical school, you first learn the fundamentals and then pick a field to specialize in. In AI terms, Transfer Learning allows a model trained on one task (e.g., understanding and writing text) to be repurposed for another (e.g., writing in your style or generating insights from your own knowledge base) with less data and time. The model’s current understanding of the data serves as a starting point, just as your knowledge of Spanish helps you tackle Portuguese with fewer lessons. Actually, in \"ChatGPT\" the letter \"P\" refers to this exact concept. GPT is short for Generative Pre-trained Transformer: these models are trained on a large body of text data beforehand. This means the model starts with a broad understanding of language, which can then be adapted (or “Fine-Tuned”) to specific tasks or domains with relatively small amounts of additional data. 
This is actually the other reason I decided to write this blog post: later I can fine-tune the AI to better mimic my own tone and style of writing, and use it in the other posts I want to create. I think for now we have learned a lot about how machines learn, but just like humans, not everybody can graduate from the program. How can we estimate the quality of a model, or tell when it has gone through enough learning? Let's talk about the pitfalls of the learning process before we talk about how we can use the models for problem solving... Memorizing is Not Learning Memorizing and learning are fundamentally different, both for humans and AI. A model that memorizes its training data is like a student who learns the answers to test questions by heart but has no idea how to apply the concepts to a new problem. In machine learning, this is where the concepts of overfitting, underfitting, and generalization come into play. Overfitting is often like “overthinking.” The AI model sticks to overly specific cues (for example, every cat it saw in training was gray, so now it rejects non-gray cats). It has effectively “memorized” some feature of the training data rather than learning the broad, general characteristics of a cat. On the other hand, Underfitting is often like “not thinking hard enough.” The model hasn’t learned enough distinctions (for example, it labels every image with black dots as “muffin,” regardless of whether it’s a muffin or a chihuahua). It’s too simplistic to differentiate the classes properly. In both cases the model makes mistakes on new data, which tells us it has not really learned the concept. True learning lies in the sweet spot, where the model Generalizes. Instead of memorizing noise or oversimplifying, the model learns patterns that apply to new, unseen data. Humans do this intuitively. When distinguishing dogs from muffins, we don’t count blueberries or examine pixels, we look for meaningful attributes like fur texture, ear shape, or overall structure. 
This image shows how AI models can detect and prioritize meaningful features, such as ear shape or eye placement, to generalize effectively. This approach allows the model to perform well in real-world scenarios, avoiding both overfitting and underfitting. A poorly curated training dataset can cause both overfitting and underfitting. If the training set is too small or lacks diversity, the model may stick to specific, irrelevant features. This leads to overfitting: the model performs exceptionally on that narrow slice of training data, but fails on anything that looks different. Conversely, if the training set is broad but poorly labeled, is missing crucial examples, or otherwise doesn’t highlight distinguishing features well, the model may never learn the nuance between classes. This leads to underfitting, where the model is too simplistic and lumps everything together. You can see similar behavior when you ask ChatGPT to generate an image of a clock. No matter what time you ask for or how hard you try, it almost always shows 10:10. This happens because most of the clock images it learned from on the internet are set to 10:10 for marketing and aesthetic purposes, so the AI is biased toward that specific time and struggles to draw the clock hands in any other position. So now we know it’s not just about the size of the dataset but also its quality, diversity, and labeling. We actually have a saying for these situations in Machine Learning: \"Garbage in, Garbage out!\" That's why we have to pay attention to the data we use for our model training, and we should consider what type of problem we are dealing with. Choosing and shaping the right inputs is part of what we call Feature Engineering... Understanding the problem is half of the answer The best way of teaching this part is to start with a few real-life examples. Real-life examples help ground abstract concepts in familiar contexts, making them easier to understand and relate to. 
Imagine that we have found an old chest on the beach and we decide to open it and see what's inside. The good news is that we have found lots of coins, but the bad news is that they are so old and rusty that we can’t tell what coins they are. So we decide to sort them by size just to find out how many types of coins we have found, which seems to be a good start. This trick is very similar to the unsupervised learning method, because we can’t label them yet, we are just grouping them. Once we finish with the grouping, we can see we have two groups: small coins and big coins. But it’s still very vague, there are lots of small and big coins, so we need another parameter and divide each group by weight. Now each group splits into heavy coins and light coins, giving us four groups in total. Now that we feel more confident in finding the type of coins using these parameters, it's time to introduce the labels. We bring out the coin collection we have and start measuring the same parameters, which are size and weight. We find 4 coins that are in the same range as the coins from the chest. So we bring these 4 clean coins to use them as labels for our rusty coins. By placing all of the coins (rusty and new ones) on the same chart, we can see they are a perfect match, so we assume that the rusty ones are the same coins as the new ones we added to their groups. This part of the solution is exactly what we see in the Supervised Learning approach in Machine Learning. Congrats, you have just solved the mystery. Now in real-life scenarios, when we are dealing with raw data, it's more complicated. There might be more noise in the data, more candidates for the labels, and more than two parameters. But overall, this was a simple example of what we do in machine learning projects. 
We usually have two types of problems in Machine Learning: you are either trying to predict discrete labels (like tagging emails with \"spam\" or \"not-spam\" labels), which we call Classification, or you are trying to predict continuous values (like estimating traffic), which we call Regression. What we did in the coins example was a simplified classification problem. We saw how adding new dimensions to our data helped us with the solution. Now let's see a simple Regression example... Let’s say we are a farmer and we want to price our oranges for selling in the market. If we price them very low we might lose some profit, and if we price them really high there is a chance that nobody buys them. So how can we find the optimal price for our oranges? The first thing we can do is gather data. We go to the market, buy one of every type of orange we find, and write down the prices. This helps us figure out how other people are pricing their oranges. Once we have them all, we start the same process we did with the coins, dividing them by their parameters. First we sort them by their size, and since we want to find the price, we also sort them by their price. Then it becomes very obvious that the larger the oranges get, the higher their price. So there is a direct correlation between the size and the price. Once we draw the line that we see in the chart, we can place our orange where it belongs by size and read its price range off the other axis. Now these were extremely simple examples just to show how different problems differ in nature. In reality, there is much more variety in the data you can encounter, and each kind represents a type of scenario that requires a specific algorithm and methodology to solve. For example, predicting the price of a house based on its size and predicting the traffic at a specific time are both Regression problems in nature, but have very different data attributes. 
Or once you look at data about foods based on the amount of sugar and fat they contain, you will see that there are some correlations, but all you want to do is label them as healthy or unhealthy, which is a classification problem. We often encounter overfitting and underfitting in both regression and classification problems, especially when our datasets contain noise, which is typical because real-world data collection is imperfect. In regression models, noise can lead to overfitting, where the model captures the random fluctuations in the data rather than the underlying trend. This results in a model that performs well on training data but poorly on unseen data. In classification tasks, noise can cause the model to learn from random errors or irrelevant information, which also leads to overfitting. And once you try to avoid these situations, you might run into underfitting pitfalls. For example, when predicting whether a student will pass or fail based on the number of practice sessions and hours of study, a model that predicts a pass after a certain number of practice sessions or a specific number of study hours, without considering the combined effect of both factors, is an underfitted model. This oversimplification may overlook students who need a balanced approach to succeed. On the other hand, a model that thinks there is an exact formula linking the number of practice sessions and study hours to passing or failing is an overfitted model. Such a model might capture noise or anomalies in the training data, leading to poor generalization to new students. A well-trained model recognizes that both practice and study hours contribute to a student’s success, allowing for some flexibility in their combination. This approach captures the general trend without being too rigid or too simplistic. 
And as an example of an everyday regression problem, let's imagine we are training a model to predict the growth of a social media account’s followers over time with each post we make. A model that assumes a constant rate of follower growth over time, failing to capture the initial rapid increase and subsequent plateau commonly observed in social media growth patterns, is underfitted. This could lead to inaccurate predictions, especially in the early and later stages. On the opposite end, a model that attempts to account for every fluctuation in follower count, such as temporary spikes due to viral content or drops due to unfollows, is an overfitted model. This model becomes too complex and may not generalize well to future data. In this case, a well-trained model captures the typical trend of rapid initial growth followed by a gradual plateau, possibly using a logarithmic function. This approach balances complexity and simplicity, providing accurate predictions across different stages of account growth. Now the question is, how can we tell if our model is being trained correctly, or is going to fall into either overfitting or underfitting behaviour? What are the symptoms and how can we avoid them? Evaluation Metrics We always monitor our AI’s learning behavior while training is in progress. Just like a teacher examines students’ performance on homework and quizzes to predict their final exam results, we evaluate the model’s errors after each full pass over the training data (commonly referred to as an Epoch). By splitting the dataset into Training and Testing sets, we can assess the model’s ability to generalize and perform well on unseen data. In the early stages of training, it is natural to observe a high number of mistakes. As the model learns from the data, we aim to see a steady decline in both training and testing errors over time. This gradual improvement indicates the model is learning effectively. However, real-life training scenarios are rarely so simple. 
By closely monitoring the trends in errors, we can diagnose our models. Overfitting and Underfitting are things we try to avoid in machine learning, but when some error seems unavoidable, we might prefer to lean towards one kind of mistake rather than the other. Let's clarify this with an example: imagine your email spam filter deciding whether an incoming message is junk. If you find a spam email in your inbox once in a while, it might be OK, but missing an important email just because the filter thought it was spam is unforgivable. So maybe relaxing this strict filter and using a more moderate setting is not a bad idea. But let’s flip the scenario: consider a system that detects a serious health condition. If it’s too lenient, it could overlook patients who genuinely need urgent care. In contrast, if it’s too strict, it may send many healthy people for unnecessary, expensive, and stressful follow-up tests. However, while extra tests can be inconvenient, missing a sick person could be far more dangerous. That's exactly why we use Evaluation Metrics after the training process, even if the learning curve shows a healthy training process. Different scenarios demand different evaluation metrics, each tailored to the problem at hand. This is where concepts like \"Accuracy\", \"Precision\" and \"Recall\" become essential. To calculate these metrics correctly, we categorize predictions into four groups: True Positive (TP): The model correctly identifies spam as spam. True Negative (TN): The model correctly identifies legitimate emails as not spam. False Positive (FP): The model incorrectly flags a legitimate email as spam. False Negative (FN): The model misses a spam email. We heard these terms a lot during the COVID-19 pandemic, whenever a rumor was spreading in the news about different test kits and vaccines. 
Whenever a new home test kit comes to the market, we test it on different patients, gather all of the results, and compare them against the actual status (confirmed using blood tests or CT scans). Now let's use the results of this example (9 true positives, 29 true negatives, 3 false positives and 9 false negatives out of 50 samples) to calculate the metrics and judge the quality of the new kit. We start by analyzing the big picture first. Accuracy measures how often the model predicts the target and raises the flag correctly. It’s the proportion of correct predictions (true positives and true negatives) out of all predictions. Using the following formula we can calculate our kit's accuracy: \\[ \\text{Accuracy} = \\frac{TP + TN}{\\text{All Samples}} = \\frac{9 + 29}{50} = 76\\% \\] While accuracy seems like a straightforward metric (and 76% doesn't sound too bad in this example), it can be misleading, especially in imbalanced datasets (where one class heavily outweighs the other). That’s why accuracy alone is not enough for evaluating the model’s performance. Precision comes next and measures how many of the predicted positives are truly positive. It indicates how well the model avoids false positives: \\[ \\text{Precision} = \\frac{TP}{TP + FP} = \\frac{9}{9 + 3} = 75\\% \\] A fairly high Precision (like 75% in this example) means that when our model predicts a positive result, it is correct three times out of four. However, a high Precision alone does not tell us whether the model is missing a lot of actual positives, which is crucial for medical cases. Recall measures how many of the actual positives our model correctly identified. It’s a measure of how well the model avoids false negatives. \\[ \\text{Recall} = \\frac{TP}{TP + FN} = \\frac{9}{9 + 9} = 50\\% \\] As we see, the recall of this kit is 50%, which is not that high: it misses half of the people who are actually sick. If Recall is high, it means the model catches most of the positives. But pushing Recall higher usually brings an increased number of false positives as well. 
As you see there is always a tradeoff here between Precision and Recall, and a closely related tradeoff in training actually has a name: the Bias-Variance Tradeoff. Bias refers to the model’s tendency to oversimplify. High bias (underfitting) means the model cannot capture the complexity of the data, leading to errors like assuming every email that contains the word \"money\" is spam. Variance refers to the model’s sensitivity to the data it was trained on. High variance (overfitting) makes the model overly focused on irrelevant details, basically like an overthinker. A high-bias model misses the point entirely, while a high-variance model gets tangled in noise. The ideal model controls both bias and variance, striking a balance that allows it to generalize. Yes, judging students' performance according to their behaviour, or the healthiness of a snack based on how much sugar and fat it contains, seems understandable, but when we turn to cases like spam detection, how do we represent text data in a way that machines can understand? After all, these creatures only understand numbers, but can words be quantified? What about images? How do machines understand and digest the data in the first place? Alphabet of the Machines Now, up to this point, we’ve explored various examples where feature engineering played a key role in defining parameters to solve our problems. We saw how adding new parameters to our model makes it multi-dimensional and easier to understand. We can explain the relationships and extract meaning from our data points, simply because they are Vectors in a multidimensional space (often called a Feature Space, or a Latent Space when the machine learns the dimensions itself), and machines are really good with math. When we talk about the price of a house, its size and number of rooms matter, and whenever the choices are still hard, we can always add more parameters to the calculation: how old it is, which neighborhood it is in, and so on. It's easy because numbers are easily comparable to each other, but what about data that is not numeric in nature? 
Well yes, a neighborhood is not a numeric factor, but you can quantify it by some numeric feature it has, like the crime rate or the proximity to landmarks. But what about words? Let's try the same method we used with the coins and oranges on a group of words, to see how far we can go with dimensions to quantify words... As we can see, machines understand words not as abstract concepts like us, but as points in a multidimensional space, where each dimension encodes specific attributes or relationships. But how do these dimensions come to exist? How does a machine ‘learn’ that the relationship between ‘King’ and ‘Queen’ is similar to that between ‘Man’ and ‘Woman’? This is where the magic of Natural Language Processing (NLP) begins... It's f***ing amazing that you have read this far, and I highly appreciate your time and curiosity if you really did! But, as I said in the beginning, I will talk about the rest (NLP, Neural Networks, Deep Learning, LLMs, xAI, AGI and Ethics) in future posts. Stay tuned, and check back frequently. Thanks 🦾"
    },
    {
      "id": "blog:ai-changed-your-business-dubai",
      "type": "blog",
      "title": "What Did AI Change in SMEs?",
      "slug": "ai-changed-your-business-dubai",
      "url": "/blogs/ai-changed-your-business-dubai.html",
      "sourcePath": "blogs/digital-transformation-dubai.md",
      "tags": [
        "Digital Transformation",
        "AI",
        "Dubai"
      ],
      "description": "Why digital transformation in Dubai is now an architecture problem, and how agentic web platforms connect websites, CRM, chat, analytics, and lead intelligence.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "What Did AI Change in SMEs? Why digital transformation in Dubai is now an architecture problem, and how agentic web platforms connect websites, CRM, chat, analytics, and lead intelligence. What Did AI Change in SMEs? Who else remembers November 30, 2022? That was the day ChatGPT launched. And within a week, every LinkedIn post, every agency deck, and every dinner conversation had the word \"AI\" in it. Within a month, half the world was terrified of losing their jobs. Within six months, the other half was selling AI courses. But do you see how it's different now? You can actually see the daily changes day by day. We are not just talking about Artificial Intelligence anymore, it is now a part of our daily lives, just like the internet did in the 90s, and we are expecting similar scenarios to happen again. The Three Years Nobody Likes to Talk About Let's be honest about what actually happened between late 2022 and today. The first wave was pure panic. \"AI will replace your job\" was the headline. Graphic designers, writers, developers, customer service teams, all of them bracing for the apocalypse. Companies rushed to add \"AIpowered\" to their marketing materials before they even understood what that meant. The term \"Artificial Intelligence\" became the most misused phrase in business history, attached to everything from a simple dropdown menu to actual machine learning models. Then came the backlash. \"AI is just autocomplete.\" \"ChatGPT makes things up.\" \"My team tried it for a week and went back to how we worked before.\" The whole narrative shifted in the other direction, and a lot of businesses decided to sit this one out and wait for the \"mature\" version. Here is the problem with that: the mature version arrived, and they missed it. By 2025, the businesses that kept experimenting quietly built real advantages. They figured out what AI is actually good at. 
They stopped using it as a party trick and started building it into their infrastructure. The ones who waited are now trying to catch up to competitors who have a year or two of head start on something that compounds. That is where we are in 2026. The dust has settled. The hype is done. What is left are systems that actually work, and a growing gap between businesses that have them and businesses that do not. Digital transformation is not a technology purchase. It is an architectural decision. It means rethinking how information flows through your business, how your customer touchpoints connect to each other, and how decisions get made. A website is a customer touchpoint. So is a CRM. So is a chatbot. So is a WhatsApp message from your sales team. Digital transformation is what happens when all of those touchpoints, including your website, share the same data layer, speak to each other in real time, and feed insight back to your team automatically. When that is working, your business operates as one intelligent system rather than a collection of tools that technically exist in the same company. Most businesses in Dubai have the tools. They do not have the architecture. They have a website on WordPress or another no-code platform, which won't give them the flexibility that custom code would. A CRM that sales updates manually. A WhatsApp Business account that nobody monitors consistently. An analytics dashboard that someone checks once a month. They are surrounded by data and still making decisions blindly because none of it connects. The shift to an agentic, connected system is what we mean when we say digital transformation. Not a new logo. Not a redesigned homepage with AI buzzwords on it. A new operating model for how your business handles every visitor, every lead, and every conversion opportunity. Why Dubai Is Where This Matters Most Dubai is not a normal market. International buyers research Dubai property from London at 11pm on a weeknight. 
They are comparing four agencies simultaneously. They will engage with whichever one responds fastest and most intelligently to their behavior. If your website cannot tell the difference between that buyer and someone who clicked your ad by accident, you have already lost. Patients fly from Europe, Russia, and across the GCC for cosmetic procedures and specialist medical care. They spend weeks researching online before they ever contact a clinic. When they are finally ready to book, they will send the same enquiry to three or four clinics and wait to see who responds with the most confident, relevant, immediate answer. The first clinic that does wins the booking. Financial services clients, whether they are looking for mortgage advice, investment guidance, or business setup, make trust decisions online before they ever speak to a human. They look at your content, use your calculators, and judge your credibility from your website alone. If that experience is passive and generic, the trust is never built. In each of these industries, the website is not just a marketing tool. It is the first and most important commercial interaction your business has with every potential client. What it does in that window, and how intelligently it does it, determines the outcome. The gap between these two journeys is not a design decision. It is an architecture decision. What We Build at ShapeShifters We do not build websites. We build agentic web platforms. The distinction matters, and it comes down to one architectural principle: everything runs on the same data layer. At ShapeShifters, that layer is Sanity CMS. The website reads from it. The chatbot reads from it. The lead scoring system writes to it. The SEO engine pulls from it. The content automation publishes into it. When a property listing changes, the chatbot knows immediately. When a visitor interacts with the chatbot, that interaction feeds back into their behavioral profile. 
These six capabilities are not sold as a fixed bundle. They are building blocks, combined at whatever depth the business actually needs. A medical clinic might need only a chatbot and lead intelligence. An off-plan developer might need all six running at full depth. The platform adapts to the business, not the other way around. The Gap Is Closing Fast The businesses winning in Dubai right now are not the ones who added AI to their website. They are the ones who let AI become the website. Three years ago that distinction was academic. Today it is the difference between a sales team that makes warm calls with full buyer context and a sales team that cold-calls a list of names with no information. It is the difference between a clinic that wins the international patient and a clinic that sends the same generic email response everyone else sends. The technology is no longer experimental. The case studies are real. The ROI is measurable and it shows up in the first month. If your business is in real estate, medical, or financial services in Dubai, and your website is still waiting for visitors to fill out a form, you are not watching the future arrive. You are already behind it. The conversation about what to do about that starts here. Or if you want to see the full case studies first, they are at shapeshifters.dev/casestudies."
    },
    {
      "id": "blog:ai-is-a-tool",
      "type": "blog",
      "title": "Why There Will Be No AI Apocalypse?",
      "slug": "ai-is-a-tool",
      "url": "/blogs/ai-is-a-tool.html",
      "sourcePath": "blogs/ai-is-a-tool.md",
      "tags": [
        "AI",
        "Technology",
        "Society"
      ],
      "description": "A grounded argument for treating AI as a tool, not an apocalypse, with a tour through human progress, limits, and alignment concerns.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "Why There Will Be No AI Apocalypse? A grounded argument for treating AI as a tool, not an apocalypse, with a tour through human progress, limits, and alignment concerns. Why There Will Be No AI Apocalypse? | Behnam Khorsandian I remember when they told us in school that the only thing that separates us from animals, was the intelligence, and i remember being the smartest species on earth was a huge flex for us and made the whole humanity to think that we own the planet. Well, we have managed to create something that changes everything. We alone created languages, solved complex problems, wrote literature, made art, and built skyscrapers, satellites, and the internet. However in the last few decades, this exclusivity has started to fade. Machines, once limited to printing “Hello World” on a black and white screen, now ignores your request with a simple \"Sorry, i can not assist with that\". Today, artificial intelligence (AI) can compose text that can pass as our own to your teacher, drive cars more reliably than some humans, and detect diseases from medical images with higher accuracy than many doctors. In fact, once you try to join forces with AI, the results get worse, which means doctors should leave the AI alone and let it do its job, LITERALLY (You can read the whole story and the published paper here). So what is going to happen next? Are we their slaves? Are they going to eat us? No. Every time someone resist using AI for some task, i vividly remember my geometry class in school, whenever someone tried to draw a line by hand, the teacher would say: We all have reached to this point, survived the nature and enjoyed the comfort of the couch in front of the TV, just because we used our tools, and you thought a ruler is too much for you... I would be lying if I say I didn't try to start this blog with ChatGPT, but at the end, it just couldn't do it as i like and that shows AI can't replace humans that easily (yet). 
This is one of the very reasons I decided to start this series about AI (I will talk about the rest of them later): to show you that AI is not a threat, that it's simply another tool, and that we should not worry about it (I will talk about the things we should worry about later). The main reason behind this behaviour of ours is Anthropomorphism, our bias toward seeing human traits in non-human things. We naturally give things personality and intentions. We yell at our computer asking \"where are my files?\" when it's our responsibility to save them, or talk lovingly to plants. We also project this humanlike thinking onto AI and machines. AI tools don’t get angry, bitter or jealous. They aren't secretly cooking up evil plans, they just follow code, math, and probabilities. Your Siri isn't passive-aggressive, maybe you mumbled. Netflix isn't judging your binge habits, it's just math, patterns, and clicks. Knowing this human tendency should help us keep our sanity regarding AI paranoia. A brief review of humanity's breakthroughs Look at it this way: every breakthrough in our history wasn’t just an invention, it was a revolution. Before we get wrapped up in debates about AI, let me refresh your memory about how we got here. From Caves to Stars It all started with Fire. Not just a way to stay warm or roast a meal, but the spark that lit up the night and ignited our collective imagination. That first flame was a bold statement: we wouldn’t be held captive by darkness. Next came Writing. By putting thoughts into symbols, we made ideas permanent. Suddenly, knowledge wasn’t fleeting, it could be passed down, built upon, and shared across generations. Or the invention of the Wheel. It was our first step toward movement and change. This simple yet important creation introduced us to a new world of transportation and trade, connecting distant communities and making it possible to discover the planet. With the beginning of the Industrial Revolution, mechanization and mass production reshaped every aspect of life. 
Factories were built, cities expanded, and for the first time, technology began to work at a scale that redefined human productivity and social organization. Discovering Electricity was like capturing lightning in a bottle: a force that powered homes, industries, and a newfound era of communication. The era of Aviation broke the chains of gravity, giving us wings to fly and shrinking our big world into a connected global village. Then arrived the digital age, started by the invention of Computers. These machines turned abstract calculations into real-world solutions and made us dare to dream even bigger with Space Exploration. Going beyond our planet wasn’t just about science, it was a declaration that human curiosity knows no bounds. And of course, the creation of the Internet was perhaps one of the most profound moments in modern history. This global network dissolved geographical barriers, fostering an era of instantaneous communication, collaborative innovation, and shared human experience. Now, just imagine for a moment a world where we had allowed fear and doubt to hold us back from each of these breakthroughs. What if the Wright brothers had given in to the terror of defying gravity, thinking human flight too dangerous to pursue? Or take the camera: when it was invented, some painters feared it would make their skills obsolete. Yet, instead of smashing cameras in protest, artists embraced photography, leading to new art forms and perspectives. Even the invention of the telephone faced skepticism, with concerns that it would disrupt personal communication. Instead, it became an indispensable tool, enhancing connectivity and fostering relationships across distances once thought insurmountable. Without the courage to explore, our society might have remained trapped in a perpetual state of “what ifs.” The creative, collaborative spirit that defines our past, and fuels our future, would have been lost in a maze of hesitation and missed opportunity. 
Throughout history, every significant technological leap has been met with fear. However, it’s our willingness to overcome these fears and integrate new tools into our lives that moves humanity forward. AI is no different, it’s just another tool created to augment our capabilities, not to limit them. Developing AI, like the innovations before it, can lead to unprecedented growth and opportunities. Actually, AI has always been here! The history of AI is not something new. It dates back to ancient myths of artificial beings, early machines like the automata of Hero of Alexandria, and philosophical ideas from Aristotle and Descartes. Even the modern concept of AI (what we think of when we hear the word) began in the 1940s with the invention of the programmable digital computer, a machine based on abstract mathematical reasoning, which inspired discussions about electronic brains. In the 1950s and 1960s, AI saw early successes like programs playing checkers and chess, and proving mathematical theorems. However, the first \"AI winter\" arrived in the 1970s, when unmet expectations led to reduced funding. But then the 1980s brought a comeback with expert systems, and the 1990s saw neural networks revive. By the 2000s, machine learning and big data drove breakthroughs in image recognition and natural language processing. Today, AI is everywhere, and in fact, it has always been. Since we talked about most of humanity's major milestones, it's only fair to also cover some of the contributions of AI. In healthcare and medicine, AI is nothing short of a lifesaver. Behind the scenes, advanced algorithms go through countless medical images and patient records to spot early signs of diseases, often faster and more accurately than the human eye (as I mentioned at the beginning of the post). AI-driven radiology tools generate precise, instant reports, and personalized medicine tailors treatments to individual needs, opening up opportunities for breakthroughs in drug discovery and patient care. 
An AI model can predict breast cancer up to 5 years in advance. While you might not even notice it, AI is quietly making our lives easier every single day. Every time you find new music on Spotify that perfectly matches your taste, get instant weather and traffic updates, or have junk emails redirected to your spam folder, you’re benefiting from Artificial Intelligence. While these features may seem small to you, life would suck without them, and these are not even the noble applications of AI. The most noble uses of AI are those addressing global challenges, particularly for vulnerable populations, in line with the SDGs (Sustainable Development Goals). These include healthcare improvements, humanitarian aid, education access, environmental conservation, social justice, and accessibility for people with disabilities, with examples such as UNHCR's refugee support and Conservation International's biodiversity tracking. These applications reflect high moral principles, promising a better future for us and our planet. LLMs (Large Language Models) like ChatGPT, which I believe are the reason we care about AI nowadays, have evolved from simple text generators into smart assistants that help draft emails, brainstorm ideas, and even provide real-time problem solving. Since their creation (or at least since they went viral) they have created opportunities for businesses to streamline operations and for consumers to enjoy a smarter, more connected lifestyle. But along with these opportunities, AI has created some fear in our minds as well. Common fears expressed on social media include job displacement, loss of human decision making and control, and concerns about AGI (Artificial General Intelligence) risks like potential bioterror threats. Many people also worry about privacy issues, data misuse and lack of transparency in AI decision making. 
These fears feed major conspiracy theories that range from claims about AI being a government surveillance tool to more extreme theories about alien technology involvement. However, these lack concrete evidence and often overlook AI's documented development history. So, why will there be no AI apocalypse? While it's natural to have concerns about AI, given the fears and conspiracy theories on social media, it's important to recognize that AI is still in its early stages, and many risks are manageable. History shows that technology, like the internet, initially faced similar fears but has become integral to our lives with proper safeguards. Look at us right now for example: it's been only 2 years (as I'm writing this in 2024), but it's really hard to remember how we did most of our daily tasks without AI. So maybe it's time to start thinking differently about this tool. Now, as I promised in the title of the post, let's see why I say there will be no AI apocalypse and that AGI (while it's waiting around the corner and I'm sure it will be as powerful as they say) will NEVER be conscious. I base the claim \"there will be no AI apocalypse\" on a few key observations: The only way to know is to do Ever heard of Conway's Game of Life? It's very simple: you start by drawing simple patterns of cells on a grid. From there, the simulation evolves. These are the only rules: a live cell survives with 2 or 3 live neighbors, dies from isolation or overcrowding (fewer than 2 or more than 3 neighbors), and a dead cell comes to life if it has exactly 3 live neighbors. These rules cause some patterns to vanish entirely, others to stabilize, and some to just keep transforming unexpectedly or growing endlessly. But here's the point: in general there's no shortcut (no fast forward or rewind technique) to predict exactly how a given pattern will behave, or to work out which input created a given result. 
To know if your initial pattern will survive forever or stop after a while, you can't just look at it and say it looks eternal. No, you literally have to let it run step by step until you are proven wrong or you run out of time. You can give it a try right here (but also search for it on Google to see some cool effects). Try drawing a pattern with your name that you think will never vanish or freeze, and while you are watching it unfold, think about how you can ever be sure that you have the correct answer. AI, running on pure logic and math, runs into the same issue. It can't predict outcomes in certain complex situations without actually crunching through each and every step. That's exactly why an AI sometimes generates a whole page of text, only to stop at the last sentence saying, “Sorry, I can't assist with that.” It genuinely can't tell earlier, because it doesn't know until it knows. There's no shortcut, no backtest, and definitely no cheat codes to skip ahead. Gödel's Incompleteness Theorem is about the same dilemma in logic and math: in any consistent formal system powerful enough to express arithmetic, there will always be statements that are true but can't be proven within that system, no matter how clever you are or which shortcuts you take. Monkey see, Monkey do As we mentioned before, ChatGPT and other LLMs are not the only AIs we use, but since they use prompting instead of programming, they became more popular, and since they generate almost everything, like text, images and sound, they became a little bit concerning. Wittgenstein shares a slightly different opinion in his Tractatus Logico-Philosophicus (don't worry, it's much easier to understand than it is to pronounce): language isn't just about words and symbols, it's deeply rooted in shared human experiences and context. 
He basically claims that without lived experiences, emotions, and cultural backgrounds, words don't truly mean anything. They’re empty shells, and it's the people who give meaning to them, just like how you can write these 🍑 🍆 emojis in a message and send it to both your partner and the local grocery store and they will think about two totally different things (hopefully). AI can generate convincing text, memes, and jokes for you, but it is not laughing, cringing, or feeling anything about the things you say or the things it generates. This limitation means there's ALWAYS going to be an authenticity barrier for AI generated communication: it can mimic our language like a pattern, but it never fully understands it, because that needs lived experience. To understand the impact of this limitation on AI, let's talk about a famous thought experiment called the Chinese Room. Imagine you are alone in a room, with just one book of instructions. There is a small gap under the door, and once in a while a piece of paper is passed to you, and you have to look up the answer in the book that matches the incoming message and send it back. But here is the problem: everything is in Chinese. You have no idea what the messages or the responses you are writing mean. You are just sure that each one is the correct response because you picked the matching symbols based on the instruction book you had (which looks something like the code below). AI experiences the same thing. This experiment demonstrates that there's a big difference between AI simulating understanding (by processing symbols in the ways it was trained to) and actually experiencing genuine comprehension or consciousness. You can say the same thing about chimps in the number memorization test, or in Neuralink's Mind Pong experiment. They don't care about math, or understand the underlying concept of numbers or counting. They don't even enjoy playing pong on a computer with their friends like we do, they just want the banana. 
‍ Loaded questions, provide loaded answers Say you're writing an essay and your chosen topic is climate change. So you ask your AI, \"Tell me about climate change,\" and it generates a paragraph. It sounds okay at first, but it's pretty vague. You're thinking you want more details, so you say, \"Give me more details.\" And what does AI give you? Another single paragraph. A longer, denser paragraph stuffed with facts, effects, icebergs melting, polar bears stranded, sealevel rise, weather patterns shifting, it's like compressing an entire movie into a 30 second clip. And of course, it either skips some good stuff or just can't fit everything you had in mind. So what's really happening here? You see, AI isn't exactly psychic, the real issue is this tricky thing called the Frame Problem, where an AI struggles to figure out exactly what details matter from the million options available. And even if you try to help AI by explicitly providing every tiny detail, clearly stating what matters or doesn't matter you run right into another issue named Complexity Brake Theory, this theory suggests that piling up details and becoming too specific just confuses your AI, it doesn't know where to start or end, struggling under the weight of too much information. It's a bit like that classic scenario: if someone says, 'Whatever you do, don't think about an elephant\" what's the first thing you picture? The elephant. AI suffers from the same flaw, except way worse. It focuses on the details you're trying the most to avoid or ignore, precisely because it doesn't understand context or intent without explicit, careful instructions. The bottom line here is simple, AI can't read your mind and it can't spontaneously decide what's relevant for you. AI doesn't just wake up one day deciding to rule the world or plotting anything at all. AI literally cannot initiate anything on its own. Everything an AI does depends entirely 100% on human input. 
And humans aren't great at clearly explaining things either. We're vague. We're impatient. Sometimes our instructions contradict each other. Honestly, if humans could perfectly articulate every detail clearly, would we even have built AI? Probably not. Turns out, the machine isn't necessarily the weak link, it's often us! Take your tinfoil hats off In my personal opinion, the only discussions truly worth our time are the debates that center around the ethical and security issues of AI. This topic also includes concerns about AI safety and the rate of capability growth versus alignment, which is a discussion that I plan to cover in future posts. Despite these concerns, I should emphasize that current AI technology, while powerful, has some serious limitations and is far from the superintelligent scenarios that fuel these fears. AI researchers call this issue the alignment problem: essentially, how can we design AI to want the same outcomes as us rather than some random destructive behavior? I will address these REAL concerns in another post, because they demand a dedicated discussion and, as it turns out, this problem is kinda hard to solve. But, glass half full, it also means the first potentially \"dangerous\" AI (if any) is likely to be hilariously incompetent at its evil tasks, or at least very easy to spot because of the reasons we have discussed. Even the evil AI needs a system update from its human developers, isn't that stupid? Let’s relax, we'll put safeguards in place long before it hits that stage (if ever). And after all, when it comes to pulling the plug, humans remain champions of quick thinking. Finally, I believe that if we understand how these things work, what the buzzwords we hear in the news and on social media mean, and how the process behind this magical tool works, we will be more interested in using it. 
So before you panic and start planning for the AI apocalypse, read my blog posts to realize how similar we are to these digital creatures, and how easily we can use them to improve our lives. And don't forget, if you learn how to use it, you will also learn how to defeat it. Thanks for reading."
    },
    {
      "id": "cookbook:infinit-plus",
      "type": "cookbook",
      "title": "Infinit Plus",
      "slug": "infinit-plus",
      "url": "/cookbooks/infinit-plus/",
      "sourcePath": "cookbooks/infinit-plus/cookbook.json",
      "tags": [
        "Longevity",
        "Biomarkers",
        "Research"
      ],
      "description": "Research on predicting biological age from NHANES blood panels, with parts of the actual blog post for context.",
      "date": "2026-06-04T14:16:58.000Z",
      "text": "Infinit Plus Research on predicting biological age from NHANES blood panels, with parts of the actual blog post for context. Longevity, Biomarkers, Research Saved notebook export nhanes_biological_age.html"
    },
    {
      "id": "project:aped-studio",
      "type": "project",
      "title": "APED Studio: Interactive Sculptures for VR Installations",
      "slug": "aped-studio",
      "url": "/projects/aped-studio.html",
      "sourcePath": "projects/aped_studio.md",
      "tags": [
        "VR",
        "Robotics",
        "Installation"
      ],
      "description": "Interactive sculptures, robotics, VR installation systems, and technical production for APED Studio exhibitions.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "APED Studio: Interactive Sculptures for VR Installations Interactive sculptures, robotics, VR installation systems, and technical production for APED Studio exhibitions. APED Studio: Interactive Sculptures for VR Installations | Behnam Khorsandian I work with APED Studio, where we try to present our work through values we care about: harmony, balance, and proportion. We borrow from older mediums like painting, music, and sculpture, and combine them with newer tools like VR, animatronics, and installation to tell the story better. We treat our subjects in a non-anecdotal, emotionally neutral way. We try to avoid trends and rigid beliefs and let the story and the aesthetics decide. Over the years we’ve shown our installations in digital and new media festivals, first in Iran at TADAEX 8 (2018) and then in Poland at SURVIVAL 17 (2019). We also collaborated with escape rooms, designing automated missions and puzzles using Raspberry Pi, Arduino, and Kinect. My role Inside the team, I took care of the technical side. My job was to give life to the interactive sculptures we design, usually by building small robots and control systems that make the work responsive, preparing models and parts and then fabricating them with 3D printing and laser cutting, and taking care of technical setups and installations. I also propose alternative technical paths when the concept needs a different angle, and I plan production so the installation can be shipped, installed, and maintained abroad without surprises. The project: “Immanuel” “Immanuel” is an immersive posthuman installation / VR experience. It follows a layered, step-by-step path: read a graphic novel, enter an abandoned laboratory, run experiments, watch a video, then meet the eponymous Immanuel in VR. Here is a clip we made in one of our exhibitions: The environment invites the viewer to move gradually into a future where research objects try to live separate, yet inevitably short, lives. 
Meanwhile, humanity loops through its own conflicts, fears, and resentments. In the VR flashback, you see the room as it used to be, and you watch Immanuel being born, and dying, again and again. Here are some photos from the SURVIVAL 17 exhibition: Since the TADAEX page is not available anymore, here is a 360 view of our installation (VR Room), plus some photos we could find on our hard drives:"
    },
    {
      "id": "project:shapeshifters-intel",
      "type": "project",
      "title": "Building a Lead Intelligence Engine on the Edge",
      "slug": "shapeshifters-intel",
      "url": "/projects/shapeshifters-intel.html",
      "sourcePath": "projects/shapeshifters_intel.md",
      "tags": [
        "Cloudflare Workers",
        "Lead Intelligence",
        "Edge AI"
      ],
      "description": "An edge-native lead intelligence package for tracking, scoring, SEO analysis, and DigitalTwin gateway exports.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "Building a Lead Intelligence Engine on the Edge An edge-native lead intelligence package for tracking, scoring, SEO analysis, and DigitalTwin gateway exports. Building a Lead Intelligence Engine on the Edge | Behnam Khorsandian At ShapeShifters, the promise we make to clients is that their website will know who is ready to buy before the sales team picks up the phone. That promise runs on a package called @shapeshiftersdev/intel. This is the project log for that package. It covers how the tracking layer works, how we score behavioral data, the architecture decisions behind rulebased vs ML scoring, and where the system is headed next. What the Package Does @ss/intel is a unified intelligence layer installed on every client website we build. It handles four things: Tracking — capturing visitor behavior from the first pageview, anonymously, before any form is filled Scoring — turning that behavioral history into a single number that tells the sales team who to call SEO — analyzing Sanity CMS documents and generating meta, alt text, schema markup, and keyword clusters Gateway — a status and export API that our DigitalTwin master system reads from to monitor all client sites from one place The package ships as a set of subpath exports: @ss/intel/tracking, @ss/intel/scoring, @ss/intel/seo, @ss/intel/handlers, @ss/intel/plugin, @ss/intel/gateway, @ss/intel/d1. Each one is independent. A client might install only tracking and scoring. Another gets the full stack. The runtime target is Cloudflare Workers. Everything runs on the edge. No Node.js servers, no cold starts, no regionlatency surprises for clients in Dubai, London, or anywhere else their buyers are coming from. The Data Model Three tables. That is the whole schema. Stored in Cloudflare D1, which is SQLite running on the edge. No external database to manage, no connection pooling to worry about, no latency hop to a separate region. 
The D1 binding lives in the Worker and queries run in the same datacenter as the request. Every row in events belongs to a leadid. That ID starts as an anonymous string generated in the browser. It becomes a real identity the moment the visitor identifies themselves, which I will get to in the merge flow section. The Tracking Layer The tracking layer is a React context provider called LeadProvider. It wraps the Next.js app and exposes a useTracking() hook that any component can call. The events we capture: | Event | Points | |---|---| | pageview | 1 | | scrolldepth | 1 | | timeonpage | 2 | | click | 2 | | formstart | 5 | | formsubmit | 15 | | pricingview | 10 | | download | 10 | | returnvisit | 3 | | videoplay | 4 | | videocomplete | 8 | | demorequest | 25 | | contact | 20 | | signup | 30 | Each event fires a POST to /api/intel/events, which writes a row to D1 and triggers a score recalculation for that lead. Session management works off a 30-minute inactivity timeout. Mouse movement and keyboard events reset the timer. When a session times out and the visitor becomes active again, a new session ID is generated. Pages visited get tracked per session, so we can see the journey, not just the individual actions. The Consent Gate Before anything hits D1, we check consent. The consent state lives in localStorage under intelconsent. If analytics consent has not been granted, events are buffered in localStorage instead of being sent to the server. Once the visitor accepts, the buffer flushes. This is also the first place the anonymous ID appears. A visitor who has not consented still gets an anon prefixed ID stored in localStorage. Their events queue up locally. If they accept consent, everything flushes. If they convert without ever accepting, the merge endpoint still works because the anonymous ID travels with the identification request and all queued events are sent along with it. 
Anonymous to Identified: The Merge Flow This is the part that makes lead scoring actually useful, because most of the valuable behavioral data happens before a visitor fills out a form. The flow: Visitor lands on the site. LeadProvider generates an anonymous ID: anonabc123. Every event that fires gets stored: either in D1 if consent is granted, or in localStorage if not. Visitor submits a contact form. The form component calls identify(\"realleadid\", { email, name }). identify() fires a POST to /api/intel/identify with the anonymous ID, the real lead ID, and any queued localStorage events. The server creates the identified lead in D1, inserts any queued events under the real ID, then runs UPDATE events SET leadid = ? WHERE leadid = ? and UPDATE sessions SET leadid = ? to move all existing D1 records over. The anonymous lead record is deleted. From that point on, the lead profile is complete. The sales team sees the full behavioral history, including everything the visitor did before they ever identified themselves. The Scoring Engine The score is not a sum. It is a weighted sum with exponential time decay. $$score = \\sum_{i} points_i \\times e^{-\\lambda \\times days_i}$$ Where: $points_i$ is the base point value for event $i$ (from the table above), $days_i$ is the number of days since that event occurred, and $\\lambda$ is the decay constant, defaulting to 0.05. With $\\lambda = 0.05$, an event from 14 days ago retains about 50% of its original point value. An event from 90 days ago retains about 1%. Events older than 90 days are excluded entirely. The practical effect: a lead who visited the pricing page yesterday scores higher than a lead who visited it three weeks ago, even if their total event history looks similar. Recency is signal. Staleness is noise. Scores are recalculated on every new event. The score column on the leads table is always current. 
The hot threshold is score 50, which triggers the red highlight in the Sanity Studio plugin and the real-time alert to the sales team. The point map is configurable per client. A real estate developer might weight pricingview at 20 instead of 10 because in their market, anyone looking at pricing is already serious. A medical clinic might weight videocomplete higher because patients who watch the full procedure explainer are more qualified than those who skim it. This is the rule-based layer, and it's where the domain knowledge lives. Rule-Based vs ML: The Architecture Decision Every system we ship starts rule-based. The point map above is the rule set. It encodes what we know about the industry: which actions predict conversion, and how much weight each one carries. This is the right starting point for two reasons. First, a new client has no conversion history we can learn from. An ML model trained on fifty conversions is not a model, it's noise with a label. The rule-based point map, informed by our experience across the vertical, is more accurate at the start than any model trained on thin data would be. Second, the sales team needs to trust the system. When they see a score of 85, they should be able to understand why. With rules, you can explain it: this person viewed pricing twice, submitted a form, and came back the next day. The rule set is auditable. A model's decision boundary is not. The graduation point comes when two conditions are met: the client has enough real conversion data to learn from, and the rule set has grown complex enough that maintaining it is becoming its own job. At that point, we start building a model that learns the conversion patterns directly from the data, rather than from our assumptions about what the data should say. The thing that changed this calculus recently is LLMs. A young system that does not yet have enough real conversions to train a model can now generate synthetic behavioral histories that match the business context. 
Not a replacement for real conversion data, but a way to give a model something to start from before the pipeline fills in with real signal. The architecture supports this graduation. The ScoringConfig interface accepts a custom pointMap that overrides the defaults. The same scoring endpoint that runs rulebased weights today can run modelderived weights tomorrow without any structural change. The Sanity Studio Plugin Every client site we build uses Sanity as the CMS. The @ss/intel/plugin export adds two tools directly into the Sanity Studio UI: a Lead Intelligence dashboard and an SEO Analyzer. The Lead Intelligence tool shows: Total leads and hot lead count (score 50) Total event count across all leads A full leads table with score, last seen, and event count A lead detail view with the complete event timeline for any selected lead The SEO Analyzer runs documentlevel audits across six dimensions: title and meta, content quality, link health, structured data, readability, and mobile signals. Each dimension produces a score and a set of actionable suggestions that editors can act on without leaving the CMS. Both tools talk to the same /api/intel/ endpoints that power the frontend tracking. One API, two consumers. The DigitalTwin Gateway Our internal system, DigitalTwin, is a master application that reads status and exports data from every client site we manage. The @ss/intel/gateway subpath exposes three endpoints that DigitalTwin polls: GET /api/intel/status — returns lead counts, hot lead count, new leads this week, SEO score averages, and content freshness GET /api/intel/export — paginated export of leads, events, or SEO analysis POST /api/intel/update — allows DigitalTwin to push configuration updates back to the client site The gateway endpoints are authenticated with a shared secret. DigitalTwin holds the key. Client sites verify it on every request. 
This architecture means we can see the health of every client system from one place, without logging into each one individually. When a client's hot lead count drops to zero for three days in a row, DigitalTwin flags it. That is a signal we need to look at their tracking setup, or their traffic, before the client notices it themselves. What is Next The current scoring engine is rulebased with configurable weights. The next layer is a model that learns the weights from conversion data rather than having them set by hand. The infrastructure for this is already in place: the event history is clean, the conversion events are labelled, and the scoring engine accepts external weight maps. The other piece in progress is the scanner. @ss/intel/scanner reads // @ss tags from component files and generates a blueprint of which sections of a site are instrumented for lead tracking, SEO, and chat. This makes it possible to audit a site's instrumentation coverage from the outside, and to autogenerate Sanity schemas and GROQ queries for new sections without writing them by hand. The goal across all of it is the same: the moment your sales team picks up the phone, they should already know who they are calling and why."
    },
    {
      "id": "project:golead",
      "type": "project",
      "title": "GoLEAD: Flexible AI Platform for Higher Education",
      "slug": "golead",
      "url": "/projects/golead.html",
      "sourcePath": "projects/golead.md",
      "tags": [
        "EdTech",
        "Generative AI",
        "Analytics"
      ],
      "description": "Solution architecture and AI systems for GoPrep, an adaptive courseware platform for higher education.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "GoLEAD: Flexible AI Platform for Higher Education Solution architecture and AI systems for GoPrep, an adaptive courseware platform for higher education. GoLEAD: Flexible AI Platform for Higher Education | Behnam Khorsandian At GoLead, I work as a Solution Architect, designing and implementing AIdriven learning solutions for the EdTech and L&D sectors. I initially joined as an AI Engineer, but my role quickly expanded into solution architecture and AI consulting, where I now bridge technical strategy with product innovation. Our product, GoPrep (used to be called GoPlus before rebranding) is an innovative platform focused on enhancing learning experience for higher education through cuttingedge technologies like Machine Learning and Generative AI. This platform supports educators and learners by offering adaptive assessments, custom content authoring, and datadriven insights to optimize educational experiences. ‍ In this Project Log I will explain: What Is our product, and What our AI does (User Journey) How we analyze the data we have collected (Case Study) How did I build the AI service (Solution Architecture) So let's talk a little about the platform before going in depth with my experience and contributions... ‍ What is GoPrep? Who is Zacky? GoPrep is GoLEAD’s AI‑powered courseware platform. It helps instructors design courses, assess learning, and act on analytics, while students get personalized study plans and an engaging learning experience. To explain how we made that possible, I should break down the main features we offer in this product, and I will explain how our AI, Zacky, will help the stakeholders in every step which is the main part of my work in the Generative AI section... ‍ Study Plan & Learning GoPrep’s Study Plan is where a course takes shape and the materials become learnable. 
Instructors organize chapters and learning objectives, then compose materials with a Notion-style block editor, where you can include text, images, video, voice-overs, and embedded tools like calculators and spreadsheets. This flexible learning module keeps the plan and the content together. On the student side, each objective flows the same way every time: first learn, then practice, then a quick EarnPlus to self-evaluate (EarnPlus is our gamification layer, which makes you earn a certain number of points before moving forward). It’s responsive across devices, and we plan to support live teaching too, with Jam Sessions that let you present the exact materials you authored in an interactive class setting. Our AI helps you create the course blueprint from your materials and write the blocks, or extract them directly from the resources you provide. Assessments Assessments sit on that foundation and come in four formats: Quiz, Test, Exam, and the EarnPlus we talked about. Instructors can run them online with customizable policies. Question pooling groups items by concept and difficulty and can randomize selection to keep things fair. We support 10 different question types in these assessments: True or False; All That Apply; Multiple Choice; Ranking; Matching; Fill in the Blanks; Long Answer (Essay); Numeric Questions (supports mathematical notation and a single numeric answer); Tabular Questions (using Excel-like spreadsheets); and Hotspot (finding the answer in an image). Our AI can design questions from learning materials, extract them directly from your resources (PDF or CSV), and generate similar questions from the parent questions you add to the Question Bank, while maintaining the same scope and level of difficulty. Ebook You have the option to add an Ebook to your course as a reference for students, so they can read it, highlight passages, and add notes to it.
Our AI lets instructors design the course and materials directly from this book and link learning materials to specific parts of it. It also lets students interact with the book and ask questions about it through our chatbot, and even ask Zacky to add practice assessments for them. Our AI supports almost every file format you might have, including PDF, EPUB, DOC/DOCX, Markdown, RTF/TXT, and HTML. Analytics Analytics closes the loop by turning activity into decisions. Dashboards connect questions, assignments, and progress so you can see performance at a glance or drill into the details. We analyze Parent questions and their AI‑generated Descendants to understand concept‑level mastery, track participation and outcomes as assessments run, and surface trends in gradebooks and student result pages. This includes engagement time and learning‑objective performance. The goal is to make “what’s happening” and “what to do next” equally obvious for instructors and learners. Our AI comes into play in two different ways for analytics: 1 As a popup, triggered by a button on each chart, that helps you understand the numbers and charts you see on the screen. 2 As a chatbot assistant that you can ask to dig deeper into the data and show you knowledge gaps, top-performing students, and much more. What do we do with the Data behind the scenes? Everything I have explained so far is the Generative AI part, which mostly handles the customer-facing features. But my job doesn't end there; that is where it actually begins. The goal is to create more effective learning experiences by analyzing patterns in student data and using these insights to implement targeted interventions and improvements to the platform. I ran data studies to inform “Zacky Insights” and the adaptive engine.
Here, I will briefly share one of our studies, run after we completed a pilot with King Saud University (KSU), which answers three critical questions: How did our AI help students and instructors? How does repeated practice (“Earn Plus”) translate to test performance? How does study timing relate to outcomes? In this pilot, there were 700 students in total, and overall they spent almost 39,000 hours on the platform (that's about 55 hours per student) on the assignments alone. This high level of engagement was driven by the large number of questions in the database, which would not have been possible without our AI. The instructor himself created only about 200 questions, and we generated on average 10 similar questions for each using AI. That means roughly 2,000 of the 2,200 questions in the bank were AI-generated, so we can say the AI saved more than 90% of the instructor's question-writing time. This metric gets even better in our current project with another university, because they don't even need to create questions manually: they just upload a PDF, and our AI extracts the questions, adds them to the database, and creates similar questions automatically. And that's only our contribution to the instructor experience. Regarding the quality of learning, which shows how students benefit from our AI, I analyzed how repeated practice (which we call Earn Plus) impacts students' accuracy and time management in their assignments (including final exams), and observed some positive correlations with test scores, meaning better scores on homework and study plan tasks are generally associated with higher test scores. Now, as you can see, it's not a perfectly linear relationship, which suggests that other factors (such as the time spent in other parts of the platform, like learning and the ebook) might also significantly influence exam performance.
So I started analyzing the relationship between students’ study behaviors and their academic outcomes (final exam score). The goal was to highlight the importance of when, how often, and for how long they engage in learning activities. Once we compare the best and worst students in exams, we can see differences in their study patterns: Here is how top students spent time on the platform: And this is how low performers spent their time: And once we put it all together, everything becomes crystal clear: Green boxes: times when the best‑performing students were active. Red boxes: times when the worst‑performing students were active. We can clearly see that during afternoon hours (12 PM–6 PM), best‑performing students were more active, especially on weekdays, suggesting higher productivity or focus. Conversely, late night/early morning (like 4–6 AM on Thursday) shows more activity for worst‑performing students, which may point to fatigue, cramming, or last‑minute attempts. (For a more accurate analysis, I excluded Test and Quiz activities from the activity windows to avoid tautological patterns.) Why does that matter to us? By examining patterns in study sessions alongside performance results, valuable insights can be uncovered to guide more effective learning strategies. The aim is to understand how timing, frequency, and duration of study contribute to success, and then translate these findings into practical tools such as personalized reminders and tailored learning plans. This involves collecting and analyzing data on student activities and applying statistical methods to reveal meaningful correlations and potential causal links, ultimately informing approaches that foster improved performance. The Solution Architecture When I joined the team, there had already been an attempt to enable AI in the platform, but it wasn't working properly. Generative AI (LLMs) was still a new trend, and there were no analytics, only data visualizations.
So I decided to take full responsibility for the AI part, and this is how and why I built it: From Monolith to Microservice GoPrep was a classic Django monolith: one repo, one runtime, one database. It was productive early on, but as soon as we pushed hard on AI features the system started to feel like a physics problem. The amount of computation required by question generation, document processing, and retrieval made the entire stack grow heavy; for example, CPU‑ and memory‑intensive jobs were competing with everyday traffic. I proposed turning Zacky (our AI layer) into a separate service so we could scale it on its own axis, isolate failures, and choose technology that fit the problem instead of forcing AI workloads through a courseware frame. The second pressure was velocity. Different teams were shipping fast, but their work landed in the same process space and deploy pipeline, so features stepped on each other. Having Zacky as its own service let us decouple development and release cadences. I could iterate on agents, prompts, and async pipelines without worrying that a long‑running extraction task would slow down a live exam or a nightly report. The clean boundary also made contracts explicit: GoPrep calls Zacky over APIs, Zacky guarantees response shapes and SLAs. Finally, we wanted a business path beyond the platform itself. By carving Zacky out, we could monetize it as a SaaS for other EdTech providers, like LMSs and courseware tools that have content and users but no AI. That meant thinking like a platform from day one: multi‑tenancy, metering, per‑tenant isolation, and an integration surface that’s not tied to GoPrep’s internal models or UI assumptions. Tech Stack GoPrep is built on Django with PostgreSQL, a solid, batteries‑included web framework and a reliable relational store.
For courseware flows (auth, enrollments, assignments, grading, schedules), Django’s conventions are productive and Postgres is great for transactional integrity and analytics‑friendly SQL. But for Zacky I needed a different shape. I chose FastAPI because Zacky’s job is to expose clean, high‑performance endpoints (often streaming) rather than render templates or manage admin views. FastAPI’s async‑first model, type‑hinted contracts, and OpenAPI generation made it simple to stand up multiple access patterns (REST, gRPC, WebSocket) with low overhead, and to evolve the API quickly as the agent design matured. On storage, we kept Postgres for the monolith and introduced MongoDB for Zacky. Early AI platforms live with changing schemas: new tool calls, evolving prompt configs, variant question shapes, and per‑tenant overrides. Mongo’s document model let me store these without schema thrash, handle unstructured artifacts (intermediate tool outputs, trace logs), and scale horizontally by service or tenant. It also made it easy to split resources, so heavy tenants can live on their own cluster while lighter tenants share. For retrieval, I paired this with a vector store (ChromaDB) so the system can embed documents and search semantically. The result is a clean separation: Postgres for durable courseware records, and Mongo + Chroma for Zacky’s flexible, AI‑native data. Operationally, Redis + Celery handle long‑running work: PDF extraction, text chunking, vector embedding, bulk question generation. Offloading these to workers prevents request threads from blocking and lets us scale workers elastically during ingestion spikes. For real‑time interactions (chat, streaming generation), FastAPI’s WebSocket endpoints keep latency low while the heavy lifting happens asynchronously in the background. The Architecture (The high‑level system) The instructor journey starts simple: someone wants to create a course but they don’t have content or questions yet.
They upload PDFs or other files; we index them, store raw files, and pass the text through a content assessment step that can understand multiple formats. Those artifacts flow into a vector database so that later, when a user asks for a particular learning objective, the system performs semantic search to pull the most relevant passages. From there, Zacky switches to creation mode. Using the retrieved snippets as grounding, it drafts learning materials, tagged by objective, and linked back to the source. In the same pass or as a follow‑up, Zacky generates assessment items aligned with the content, again referencing the source text for explainability. With content and questions in hand, the course material assembles into modules; instructors can review, tweak, and then roll the materials into a Study Plan. On the learner side, each module presents training materials and quick checks; if a student needs more practice, the system can spin up parallel variants until they reach mastery. Once students begin interacting, telemetry feeds back into analytics: performance by objective, time on task, participation patterns, and question‑level outcomes including the lineage from a Parent question to its Descendants (AI‑generated variants). Those signals drive adaptive learning and instructor dashboards, and they also flow back into Zacky to refine generation and recommend the next best action. The loop is intentional: ingest → generate → learn → measure → adapt. ‍ The Agentic Core (how Zacky works) Under the hood, Zacky runs an agent system that coordinates retrieval, generation, and chat. The Main Agent owns the conversation and decides which tools to invoke. A Query Agent specializes in retrieval, crafting search queries, calling the vector store, and returning grounded context. The Main Agent then chooses whether to draft content, generate questions, answer a student, or call back out for more context. 
Long‑running steps such as PDF extraction and embedding are queued to workers so the UI can stream progress and partial results. A typical request flows through FastAPI’s routers into services, which either route to the agent system (for generation and chat) or enqueue a task for the async layer. The agent consults the vector store via the Query Agent when it needs grounding, then produces structured outputs, validated against Pydantic models, so GoPrep can consume the results without glue code. Throughout, logs and traces are written per tenant, and rate limits/auth tokens are enforced at the edge so Zacky can be offered safely as a service. Results Separating Zacky gave us crisp boundaries and room to grow. We could autoscale ingestion workers without touching exam traffic, throttle and meter per tenant, and ship AI features on their own release train. More importantly, it made the business modular: GoPrep remains a strong courseware product, and Zacky becomes a platform that any learning system can adopt. The result is a cleaner architecture for us and a clearer story for partners: use what you need, where you need it, without inheriting the rest of our stack."
    },
    {
      "id": "project:infinit-plus",
      "type": "project",
      "title": "Infinit Plus: Predicting Biological Age with a Blood Test",
      "slug": "infinit-plus",
      "url": "/projects/infinit-plus.html",
      "sourcePath": "projects/infinit_plus.md",
      "tags": [
        "Machine Learning",
        "Health Data",
        "Research"
      ],
      "description": "A machine learning research project for estimating biological age from biomarker data and clinical constraints.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "Infinit Plus: Predicting Biological Age with a Blood Test A machine learning research project for estimating biological age from biomarker data and clinical constraints. Infinit Plus: Predicting Biological Age with a Blood Test | Behnam Khorsandian NHANES.ipynb I had the chance to work on a small project for an industry I really liked, Longevity, and in this post I will share the details of that journey. I got referred to a doctor who worked in this field, by our CEO, to help him take this research project off the ground, with the potential of it turning into a product later. He had 15 years of bio markers data (Blood tests, weight, heights and other measures) and he needed a predictive model to calculate the biological age of the client using these bio markers. He had some consideration, for example only study the people in range of 18 to 45 and narrow the biomarkers and only use “Glucose”, “Cholesterol”, “Lymphocyte”, “Corpuscular”, “Height” and “Weight” for the ease of process. In my mind it was a very straight forward task, a simple regression model using supervised learning techniques, That’s it. So I asked for the data and that’s when the challenges surfaced… ‍ Data Collection First thing I do in data related projects, is a health check of the dataset, to make sure we have everything we need and its in a good shape to work with. This is the CSV file I have received: Right at the first sight, I noticed the challenges I had with this dataset: Column names were encoded. There are a LOT of missing data. So I reached the doctor and told him about the health of the data, and he said he actually is aware of that, gave me the mapping for the column names, and explained that the process of data collection have changed in each cycle, so not everybody have their blood tested similarly. So in total we had about 69% missing glucose, 41% cholesterol, 27% lymphocyte and corpuscular volume, and very high gaps in insulin (91%) and testosterone (81%). 
Since he said he had assembled the CSV file himself, I worried there might have been a mistake while importing the data, so I decided to ask for the data source so I could collect the data points with code and be sure about their quality. The source of the data was the National Health and Nutrition Examination Survey (NHANES), which measures the health and nutrition of adults and children in the United States. NHANES is the only national health survey that includes health exams and laboratory tests for participants of all ages, so I was a bit relieved that we had a huge dataset in hand with lots of research and documentation about it. This actually allowed me to use biomarkers other than what was initially requested. I gathered the data cycle by cycle, merged it into a single dataframe, and renamed the columns to the actual biomarkers, and this was what I got: The good news was that I had much cleaner data with fewer gaps, but the sample size was much smaller (and reduced to a third after filtering to the age range), the “Testosterone” data was not found in the NHANES bank, and we still had 50% missing data on “Insulin”. Data Cleaning I quickly checked the correlations to see how strongly insulin is related to age, to decide how much I should invest in cleaning it. Judging by the heatmap you see, it didn’t answer my question, because there were so few strong correlations with most of the parameters: So I decided to just drop the rows with missing values (which left me only 2,200 samples to work with) and start training to see what we were dealing with. I wanted to see how important insulin is in predicting age.
I used an XGBoost regressor (a tree-based approach) and the results were awful: the model was off by 7.27 years on average and, cherry on top, the importance of insulin turned out to be very high, even higher than corpuscular volume, which had the highest correlation with age: So I decided not to drop the missing values, and instead to try filling the gaps first, so I could use all of the data (which more than doubled the sample size). I didn’t use the usual practice of filling with the mean or median of the values, because these biomarkers differ widely from one person to another depending on their age and other parameters. So I chose the iterative approach, which models each feature with missing values as a function of the other features and iteratively predicts the missing values. This is the result I got with the cleaned dataset (with 5,170 samples): the model was off by 6.99 years on average, which is better than the previous try, but still not very good. So I thought I could do better: instead of the iterative approach, I would use an unsupervised approach with a KNN imputer. In this method, for each missing value, we find the k nearest samples (rows) using the other features, and impute the missing value as the mean (or weighted mean) of those neighbors. But surprisingly, the results got even worse than the first try: the model was off by 7.60 years on average!!! The issue is that although this approach is simple, intuitive, and very fast for small to medium datasets, it may not capture complex feature relationships. OK, Wait a minute... While I was thinking about ways to reduce the error, it suddenly hit me: in this dataset, the age column (which I’m trying to predict) is the chronological age of the individuals, and the goal is to calculate their biological age. So should I consider the model's prediction to be the biological age? If so, then what does the error even mean?
If my dilemma is not clear to you yet, let me put it this way: what if the age calculated by the model is the actual biological age I’m looking for, and I’m ruining it by trying to minimize the error?! I thought about this for two days straight, and I was completely lost. So I decided to double check: I shared my code and results with an AI, explained the project, and asked it to evaluate my work and see if there was anything wrong with it. It just gave me the usual recommendations, like adding more biomarkers to the study and acknowledging the high rate of missing values, and encouraged me to use cross validation and optimization for better results. But it didn’t catch the actual circular trap I was in. Even after I raised my question and concern, it just said “Your model gives you a proxy for biological age, with about ±7 years of precision. Refining that precision (by adding more biomarkers, reducing missingness, stacking models, etc.) shrinks the “noise band” so you can detect smaller biological‐age deviations with confidence.” So I decided to actually explain the logical fallacy I was worried about (which BTW has a cool name: petitio principii, also known as begging the question) and then it finally got my point, and confirmed: “You’re not crazy—your intuition is spot on. By training a model to mimic chronological age, you risk smoothing over the very biological variation you care about. But if you treat the model’s prediction as biological age and its error as meaningful deviation, you can still extract useful insights.” (Duh) Syncing up with the SME I wanted a sanity check before moving forward, so I shared my progress and results so far with the doctor to see whether I was on the right track. I decided to extract a SHAP chart (a way to visualize feature contributions to a model’s predictions) from the best model I had, so he could tell me whether the model's use of the parameters made sense to begin with.
In this chart, each dot represents an individual. The x-axis shows the SHAP value, which is the impact of each feature on the model’s prediction. Red dots indicate higher biomarker values, while blue dots indicate lower values. For example, higher cholesterol values push the prediction toward an older biological age (to the right), while lower values push it younger (to the left). The same effect is seen for glucose. Insulin shows the opposite: higher values push older age, while lower values push younger age. For corpuscular volume, both red and blue dots appear on both sides, suggesting its effect is more individual-specific and context-dependent. Features lower on the chart generally have less overall importance for the prediction than those higher up. Clustering of dots near zero is natural and reflects individuals with average biomarker values that don’t strongly shift the prediction either way (like gender, which makes perfect sense). After these explanations, he confirmed that the model is broadly aligned with medical knowledge, so I was relieved that I was on the right track. So What was the issue? Once I shared my concern with the doctor, he simply confirmed that this is a very well-known issue and actually a profound dilemma at the heart of biomarker-based aging models. See, biological age is meant to capture physiological wear and tear, meaning how “old” a person’s body truly is. Our model captures average age-related changes, which might not fully translate into health risk. That’s fine for calibration, but it can wash out the very biological differences we care about. Also, for us, the useful quantity was the acceleration (how far someone sits above/below the expected age for their peers), not the raw number.
As we discussed this, we concluded that for better predictions we had three options, each with its own challenges. The first was including more modern data, such as DNA and genetic information. This wasn’t easily available, and we also wanted a model that predicts the acceleration of aging and gives users recommendations to slow it down from a simple blood test. The second was adding behavioral metrics like diet, lifestyle habits, and mortality measures. But we had avoided these in the first place because we wanted to rely only on blood tests, not on self-reported data that could be inaccurate. The third was using a larger and richer dataset (such as the UK Biobank), which not only has more data but also includes repeated measurements from the same patients across different cycles (which is super helpful because it shows how biomarkers change as people grow old). The challenge with this option was that access required taking courses and passing exams (the doctor was already in the process), working within their lab environment (which left me out), and dealing with possible biases (for example, patients might only visit the Biobank when they are already unwell, which could skew the data). Since the doctor mentioned he was trying to gain access to this dataset but it would take time, I decided to also do some research on my own to see what else is out there and how other people are doing it. Research and Literature Review When I stepped back from our prototype and dug through the literature, I realized there are three generations of biological-age models, and I had only built the first one so far... Gen1 (Where I Started) A straightforward age prediction model from routine labs that learns to mirror chronological age. It’s useful to shake out data issues and for basic calibration, but the right way to read it is via Age‑Acceleration (predicted minus expected at the same chronological age), NOT the raw “Biological Age” number.
So chasing lower MSE alone can flatten the biological variation we actually want to see. Gen2 (The Problem Solver) It's not that my current model was useless; it just wasn't what we wanted. Because training a model to match your birthday will mostly tell you… your birthday! But in this second generation, the Klemera–Doubal Method (KDM) assumes there’s a real but hidden “biological age” underneath, and it asks a fairer question: given your age, how far do your lab values sit from what’s typical for someone like you? Get it? It's like thinking backwards (which is amazingly powerful in most situations): instead of predicting your age from biomarkers, I predict the biomarkers given your age (which in theory should always sit around an expected value). That breaks the circular trap and lets genuine differences show up. Let me explain: Think of your lab panel like a dashboard. As people get older, each gauge (albumin, glucose, WBC,...) tends to drift in a predictable direction, but some gauges are noisy and overlap with others. KDM tackles this problem by regressing each biomarker on age separately and then combining the resulting age estimates with inverse‑variance weighting. I know, it's a mouthful, but it becomes crystal clear once you know the three steps it takes to solve the problem: 1 Draw the baseline for each gauge. For every biomarker, KDM learns the average line of how that marker changes with age in the population. 2 See how you deviate. It checks whether your value is higher or lower than what’s typical for your age. Bigger, clearer deviations count more, tiny or noisy blips count less. 3 Blend the gauges fairly. It combines all those deviations with smart weights (stable markers get more impact, noisy or redundant ones get less) and adds a small nudge toward your actual age so the result isn’t jumpy. The outcome is a single number: KDM Age. It indicates how old your biology looks compared to peers. Why is this amazing?
1 It doesn’t chase your chronological age as a training target. 2 It handles overlapping biology without double‑counting. 3 It’s stable enough to track over time, so trends (improving/worsening) are meaningful. So as soon as I found out about this, I got back to coding again. I grabbed the KDM formula from the paper, which looked like this: \\[ \\text{KDM\\_BA}_i = \\frac{\\dfrac{A_i}{S_A^2} + \\sum_j \\dfrac{b_j\\,(x_{ij} - a_j)}{\\sigma_j^2}}{\\dfrac{1}{S_A^2} + \\sum_j \\dfrac{b_j^{2}}{\\sigma_j^2}} \\] Here A_i is the chronological age of person i, x_{ij} is their value for biomarker j, a_j and b_j are the intercept and slope from regressing biomarker j on age, \\sigma_j is that regression's residual spread, and S_A controls how strongly the estimate is pulled toward chronological age. I then created a function and applied it to the whole dataset (it needs other samples to compare against, so it doesn't work on a single entry). Now I had a new column in the dataframe, called KDM_BA, which I could use for training instead of the Age column. But before that, I had to make sure that I had done the calculations correctly. So I did a quick health check by calculating and plotting the KDM residual (the difference between KDM age and actual age). What I was looking for was signs of a healthy or unhealthy model: Healthy Model Signs: Residuals are centered around zero: no systematic over- or underestimation. No strong trend with age: the spread of residuals should look similar across all ages. Some spread is normal: outliers are expected, but most values should cluster near zero. Signs of Calculation Error: All residuals positive/negative: likely a bug in the KDM_BA calculation. Strong trend with age: indicates model bias or a calculation issue. Extremely large/small values for most samples: possible scaling or implementation error. And here are the results: I was relieved once again, because that couldn't look healthier. Just like the actual age across a population, the KDM age is also distributed normally. Now I restarted the training, and this time the value I was trying to predict was this KDM age. And here are the results: the model is now only off by 5.83 years on average.
That's roughly a 20% improvement over the first attempt (7.27 years), which is very nice. And that's not the only improvement: with a more accurate model, I can now see the actual effect of each biomarker on aging, and their importance. Here is the updated SHAP chart: As we can see, cholesterol stays the top driver in both, but in the new model BMI moves up near the top, while insulin drops from a top driver to mid-pack, consistent with insulin’s effect being confounded once glucose/BMI are modeled better. Glucose remains strongly influential. Since I had a more accurate model in hand, I decided to also create a chart of the healthy range for each biomarker: Now, to make this more usable, I'm trying to develop a simple application where you can input your blood test results, and it will calculate your biological age using the model I have trained and give you some insights about how each of your biomarkers is contributing to your age acceleration (either making it better or worse). What Are the Next Steps? (Gen3) As you remember, I told you that there are three waves of attempts at calculating biological age, and the third generation calculates PhenoAge. It flips the target from “age” to risk. It predicts mortality risk from a compact lab panel and maps that risk back onto an age scale, yielding PhenoAge and PhenoAge Acceleration that carry clinical meaning. It does require careful harmonization and, ideally, CRP; where CRP is missing we start with a PhenoAge lite and plan to add full CRP as the pipeline expands. Improvements Needed: As you can see in the biomarker healthy range plot, there are some features with outliers (glucose, insulin, alkaline phosphatase, and creatinine) that need to be eliminated so we get more accurate readings and training. Also, I haven't used any grid search or cross validation for the model training, which might improve the accuracy, so that's one other thing I can do after I create a proper pipeline for training.
But before moving any further, I have to wait for more and cleaner data: because KDM relies on the population, the more sample data you have, the more accurate a model you get. So as soon as I get access to UK Biobank data, or find a good way to clean the NHANES dataset, I will update this post. Currently I'm only studying 5,000 individuals; there are 113,000 samples in the dataset that I can utilize after cleaning it, which means I have to use the accurate model to first fill in the gaps of the missing biomarkers, then use the cleaned version to train a bigger model."
    },
    {
      "id": "project:neo-quest",
      "type": "project",
      "title": "Neo Quest: Blockchain Powered AI Art Competition",
      "slug": "neo-quest",
      "url": "/projects/neo-quest.html",
      "sourcePath": "projects/neo_quest.md",
      "tags": [
        "Computer Vision",
        "Blockchain",
        "AI Art"
      ],
      "description": "An AI art competition concept with image similarity scoring, computer vision experiments, and blockchain mechanics.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "Neo Quest: Blockchain Powered AI Art Competition An AI art competition concept with image similarity scoring, computer vision experiments, and blockchain mechanics. Neo Quest: Blockchain Powered AI Art Competition | Behnam Khorsandian Image Comparison (Image Processing).ipynb In 2022 at the very beginning of the whole AI buzz(that started with AI image generators) when artists were thinking AI is going to take their job, I worked on an interesting project for an AI art competition concept where participants compete to recreate a target image using only AI prompts. The idea was simple: If you think it's easy, buy a ticket and join the competition. The goal was to show artists it's not that easy to recreate what's in your mind only using your words. Well NOW it's 2025 and thats not valid anymore, but in my defense, back then we only had Stable Diffusion and early version of Midjourney which you could only use via Discord, and the quality wasn't that good. Here is the very first image i generated using Midjourney: Occupied Mars The platform had daily competitions, and it was based on Blockchain (so everything was transparent and safe) and users needed to buy a ticket, and had 1 hour and 100 prompts to recreate the target image. The winner is determined by whose generated image is most similar to the target and the prize was the 80% of the collected tickets. In this post I will share how I approached the image similarity scoring system (The blockchain part wasn't that big of a deal to explain) ‍ ‍ The Problem: The core challenge was simple: given one target image per competition, how do you accurately rank candidate submissions? This might sound straightforward at first, but as I dug deeper, I realized the classical approaches I had used before were fundamentally flawed for this task. 
I chose the Girl with a Pearl Earring as the target image, and found different meme versions of it online to use as the attempts to test the system. A young woman wearing a yellow dress with white collar and a blue and yellow turban and a large pearl earring. How I Did It Initially (Classic Image Processing) In my first attempts at image comparison, I threw everything at the problem. I combined dozens of handcrafted metrics from different libraries and image processing techniques. The idea was to calculate all these metrics between the target and candidate images, normalize them somehow, and combine them with some weights to produce a final similarity score. I will try to briefly explain the methods I have tried and show the results for these two images: Global Histogram Comparisons and Pixel Statistics This method compares the overall color or intensity distribution of two images by analyzing their histograms and basic pixel statistics, providing a simple measure of similarity. It has two sides: Histogram Difference: Compares color or intensity histograms. Flattened Correlation: Correlation of flattened grayscale pixel arrays. Histogram difference is global and ignores pixel order, while correlation checks the linear relationship between pixel values. To find the image similarity using histogram difference, we have to compare the frequency distributions of pixel intensities or colors between two images, typically by computing the sum of absolute differences across histogram bins. Lower histogram difference values indicate greater similarity in overall color or intensity distribution. Then we convert both images to grayscale, flatten them into one-dimensional arrays, and calculate the Pearson correlation coefficient between these arrays. A correlation value closer to 1 signifies high similarity in pixel intensity patterns at corresponding positions. 
That wasn't enough: it just tells you whether you have used the same colors with the same brightness and contrast. I needed a method that considers the structure and the \"actual\" similarity... Structural and Perceptual Metrics These metrics refer to a family of image similarity measures that go beyond simple pixel-wise comparisons by modeling how humans perceive visual quality and fidelity. They evaluate aspects like luminance (brightness), contrast, structure (spatial relationships), and information preservation, aiming to align with human visual perception instead of just mathematical differences. For these metrics I used the Sewar library, which offers the following metrics: SSIM (Structural Similarity Index): Measures luminance, contrast, and structure similarity (closer to human perception). VIFP (Pixel-based Visual Information Fidelity): Measures information preserved in the image. They might have some uses in detecting similarities, but they only look at low-level attributes such as luminance, contrast, and structure, overlooking semantic content, contextual meaning, and high-level features. This can be problematic: two images might score as highly similar if they share structural elements despite depicting completely different subjects or scenes, or conversely, images of the same object under varying conditions might appear dissimilar. Moreover, these metrics can be tricked by adversarial perturbations that subtly alter pixel values to preserve perceptual scores while changing the image's interpretation, or by transformations like rotations and scalings that maintain statistics but disrupt spatial relationships beyond their evaluation scope. So I needed something that detects similar features and details of the photos... 
Keypoint Matching These techniques offer a significant advancement over structural and perceptual metrics by focusing on local features that are invariant to transformations like rotation, scaling, and affine changes, thereby addressing the limitations of metrics that can be fooled by spatial rearrangements or adversarial perturbations. These methods detect and match distinctive points based on gradients or descriptors, providing a more robust assessment of similarity that considers semantic content and object-level correspondences rather than just low-level statistics, making them better suited for tasks requiring invariance to viewpoint changes. These are the main methods I have tried: ORB (Oriented FAST and Rotated BRIEF): Fast binary descriptor for keypoint matching. SIFT (Scale-Invariant Feature Transform): Robust, scale- and rotation-invariant keypoint detector. Then I used the FLANN algorithm to match keypoints between images, and here are the results: To measure similarity using keypoint matching, first we need to detect keypoints and compute descriptors (I tried ORB and SIFT) for both images. Then we match descriptors using a matcher like FLANN, which stands for Fast Library for Approximate Nearest Neighbors (we could also try brute-force matching), often applying a ratio test to filter good matches. Similarity is quantified by the number of good matches relative to the total keypoints or a minimum threshold (for example, if the match ratio exceeds 0.1 to 0.2, the images are considered similar, indicating shared features despite transformations). I know it looks really cool, but it still has flaws, including sensitivity to occlusions, low-texture regions where few keypoints are detected, and potential mismatches in cluttered scenes. Also, it may not capture global context or subtle differences in color and illumination, and computational complexity can be high for real-time applications. 
These methods are used for similarity, but the usage is different: they are more useful for detecting clones than differences, so I decided to move completely to another technique that matches and compares the entire image and highlights the differences... Pixel-wise Error Metrics These metrics compute direct differences between corresponding pixels. It was not a bad idea at first, because in applications requiring speed and simplicity, such as real-time quality checks or when images are pre-aligned and transformations are minimal, pixel-wise metrics provide fast, interpretable results without the computational overhead of feature detection and matching. They are also more straightforward to implement and debug, making them suitable for baseline comparisons or when semantic invariance is not needed. These are the main metrics I tried in this family: MSE (Mean Squared Error): Average squared difference per pixel (lower is better). RMSE (Root Mean Squared Error): Square root of MSE (interpretable in original pixel units). PSNR (Peak Signal-to-Noise Ratio): Ratio of max signal to noise (higher is better). Again, it looks promising, because it clearly highlights the parts of the image that are different, so it gives you an error to subtract from the full score. But there is a HUGE issue: these metrics are highly sensitive to spatial misalignments, lighting variations, and transformations like rotation or scaling. This means they fail when images are not perfectly aligned or have undergone geometric changes, leading to inflated error values even for visually similar images. Why Was Classical Computer Vision Not Working? I tried different combinations of these techniques, and even created an aggregated system that gives a weight to each metric and calculates the average score, but none of my attempts were good enough. Each metric has a different range and statistical meaning. For example, MSE could be in the thousands while another metric is between 0 and 1. 
I tried to normalize them with arbitrary constants (like subtracting from 1e16), which completely broke interpretability and made the system unstable. Also, I was computing 10 different metrics, many of which were highly correlated. This added computation cost but very little new information. The weighted average at the end was just guessing, because there was no principled way to set the weights. Each new target image or dataset would require manual retuning. The final score had no clear interpretation. I was convinced that this approach was not working, and that I couldn't do this the lazy way. The most telling sign that this approach was wrong came when I tried to tune the weights. No matter how I adjusted them, sometimes just a random noise image got a higher score than an almost perfect clone. Then I remembered I know Deep Learning... While I was doing trial-and-error with the weights, I remembered I could use machine learning to optimize the weights for me. The best metrics I had so far were the pixel-wise errors, and the only issue was that the system was fundamentally not learning what \"similar\" meant for this task, so it failed with even a small misalignment. If I could train a model that learns the \"space of valid variations\" of the target image, then candidates that fall within that learned manifold should score well, and candidates that fall outside should score poorly. The reconstruction error of an autoencoder trained on augmented versions of the target would be exactly this measure. I decided to train a small convolutional autoencoder for each target image. The challenge was that I only had 1 target image, so I needed to generate diverse examples that capture different variations while avoiding overfitting to pixel-perfect identity (we call this method Data Augmentation). 
I created 250 augmented images from the target image using the following techniques: horizontal and vertical flips; random rotations; color jitter (brightness, contrast, saturation, hue); perspective distortion; random crops with padding. The Autoencoder Architecture The architecture I built was simple yet powerful. The encoder starts with the input image at 256x256 pixels and progressively compresses it through four convolutional layers, each with a stride of 2 that halves the spatial dimensions while doubling the number of channels (from 3 initial RGB channels up to 512 in the deepest layer). This downsampling transforms the image from a detailed picture into a compact 16x16 feature map, with batch normalization and ReLU activations ensuring stable, efficient learning along the way. It's like distilling the visual essence, stripping away the noise to focus on the core structure. At the middle of the model I added the bottleneck, a deliberate constraint that forces the network to make tough choices. Here, the spatial features are flattened and squeezed through a dense layer into a mere 128-dimensional latent vector. This compression is intentional: it's not about preserving every pixel but about capturing the semantic structure that defines the target image. I could have opted for a U-Net with skip connections, which would reconstruct images almost perfectly, but that would defeat the purpose. By funneling everything through this narrow bottleneck, the model learns to prioritize what's actually important, discarding small details that don't contribute to the image's identity. Finally, the decoder mirrors the encoder's path, but in reverse. It starts by expanding the 128-dimensional vector back into spatial dimensions, then uses four transpose convolutional layers to upsample from 16x16 back to the original 256x256 resolution. The final layer outputs a reconstructed RGB image, ready for comparison. 
The loss function is straightforward: the same pixel-wise mean squared error between the input and its reconstruction. But the magic isn't in the loss itself this time; it's in what the model learns from the training data, how it internalizes the variations that are acceptable for this particular target. What makes this bottleneck so crucial is its role in discriminative power. Images that share the target's essential structure will reconstruct with low error, while those that deviate significantly will struggle. The stride-based downsampling ensures the model sees global patterns at a 16x downsampling level, encouraging it to learn overarching structure rather than memorizing local pixels. Batch normalization and ReLU keep training stable and fast, and the model's modest size (relative to today's giant models) prevents it from overfitting to pixel-perfect replicas, instead fostering meaningful, generalizable embeddings. Training the Model and Scoring Submissions Training this autoencoder was super easy, much better than the complicated weighting schemes I had wrestled with earlier. I used mean squared error as the loss, paired with an Adam optimizer that included weight decay to prevent overfitting. A plateau scheduler adjusted the learning rate based on validation performance, and I saved the best checkpoint to ensure I captured the model's peak understanding of the target (basically every best-practice technique for deep learning). For each target image, I trained on 250 augmented samples for 100 epochs, but thanks to Metal Performance Shaders on my Mac (M1), training completed in just a few minutes, a testament to the model's efficiency. Once trained, scoring a candidate image became elegantly simple. I'd preprocess it to match the training setup (resizing and normalizing), then feed it through the autoencoder. 
The mean squared error between the original and reconstructed image served as the similarity score: lower error meant higher similarity to the target. This approach was not only accurate but interpretable: I could visualize the reconstruction and generate per-pixel error heatmaps to pinpoint exactly where the model detected discrepancies, turning the scoring process into a transparent dialogue between machine and human perception. Here is the same attempt we worked on in previous methods: The beauty of this system showed when I tested it on the full set of candidate images, the same meme variations I'd used to challenge the classical methods. The autoencoder didn't just rank them, it understood them. Images that captured the essence of the Girl with a Pearl Earring scored high, while those that strayed too far into parody or irrelevance fell by the wayside. And with the error heatmaps, I could see the model's reasoning laid bare, highlighting unexpected regions and confirming that the scoring aligned with human intuition. Running the model across all candidates revealed a clear hierarchy of similarity, one that felt natural and fair. The top submissions were those that balanced fidelity to the original with creative interpretation, while the outliers were penalized appropriately. This wasn't just a technical victory, it was a validation of the approach, proving that by training a model to learn the target's manifold, I could create a scoring system that was both robust and human-like in its judgments."
    },
    {
      "id": "project:one-o-one",
      "type": "project",
      "title": "One-o-One Betting: Peer-to-Peer Crypto Sportsbook",
      "slug": "one-o-one",
      "url": "/projects/one-o-one.html",
      "sourcePath": "projects/one_o_one.md",
      "tags": [
        "Blockchain",
        "Smart Contracts",
        "Sportsbook"
      ],
      "description": "A decentralized sports betting MVP built around peer-to-peer handshakes, token conversion, and oracle-backed settlement.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "One-o-One Betting: Peer-to-Peer Crypto Sportsbook A decentralized sports betting MVP built around peer-to-peer handshakes, token conversion, and oracle-backed settlement. OneoOne Betting: PeertoPeer Crypto Sportsbook | Behnam Khorsandian In early 2023, I was approached by a client with an interesting idea: a decentralized sports betting platform where friends could bet with each other using any cryptocurrency, maintain complete anonymity, and trust the system without a centralized authority holding their funds. He needed a proof of concept to validate the idea before investing anything on it. His initial idea was to ditch the bookies, and directly bet with your friends, but after few brainstorming sessions, we evolved the idea into a more interesting product. The platform had two betting modes: open bets anyone could accept (like a market), and exclusive bets directed at specific users. My job was to build an MVP over six months that proved this could actually work on blockchain. The challenge was not just making it functional, but solving several technical problems that made similar platforms fail or become too expensive/difficult to use. The Handshake Protocol I called the core betting mechanism the Handshake Protocol, treating each bet as an agreement between two parties that gets enforced by the blockchain. A handshake has a simple lifecycle: someone creates it and locks their funds, another person accepts it and locks matching funds, the match happens, and the winner gets paid automatically. ‍ ‍ The smart contract organizes everything around fixtures (sporting events) and handshakes (bets). Each fixture has a betting window and a result that gets updated after the match. Each handshake stores who is betting, what they are betting on, how much, and its current state. When someone creates a bet, they choose between exclusive (only a specific person can accept using their wallet address) or open (anyone can accept). Funds lock immediately. 
If the other party rejects or never responds before the match starts, the initiator gets refunded. Once accepted, both parties are committed. After matches finish, the contract resolves all bets at once in a batch operation, determines winners, takes a percentage as platform commission, and pays out. For draws or postponed matches, everyone gets refunded. The hardest part was not the happy path but the edge cases, where users try to game the system or transactions fail at the wrong moment. The Multi-Token Problem and Tokenomics The client wanted users to bet with any token they owned: ETH, USDC, SOL, BTC, whatever. But this created a nightmare scenario for the smart contract. If user A bets 0.1 ETH on team A and user B bets 150 USDC on team B, how do you determine the winner's payout? What if the ETH price crashes between bet placement and resolution? You cannot fairly resolve cross-token bets without a stable reference point. My initial approach was to integrate Uniswap directly into the betting contract, so every incoming bet would swap the deposited token for USDC at the moment of acceptance. The problem with this was gas costs (they were very high back then). Swapping tokens on Uniswap inside a contract call is expensive, and when you combine that with the handshake logic, bet placement was costing users a huge amount of gas. That is completely unusable for small bets. So, I designed a two-token system with an intermediary platform token. Here is how it works: users deposit any token into a separate conversion contract (not the betting contract). The Native Token contract uses Chainlink price feeds to calculate the USD value of the deposited token, then mints an equivalent amount of platform tokens to the user's wallet. The platform token is pegged 1:1 to USD via this minting mechanism, so it acts like a stablecoin but without the regulatory complexity of being an actual stablecoin. Making it Gasless Asking users to pay gas fees for every action kills adoption. 
When gas prices spike, a simple bet acceptance can cost more than the bet itself. The client wanted it to feel like a normal website where you just click buttons. The solution was meta-transactions through a relayer. Users sign messages with their wallet instead of sending transactions. These signatures go to a backend relayer service that wraps them in real transactions and pays the gas. The smart contract verifies the signature and executes the action as if the user sent it directly. The user never pays gas; the platform does. The security challenge was preventing the relayer from stealing funds or replaying signatures. I used nonces (incrementing counters) so each signature only works once, and added expiration timestamps so old signatures become invalid. The relayer is just a messenger with no special privileges: all fund transfers go to addresses recovered from the user's signature, not the relayer address. The relayer is a simple Node.js service that validates signatures, simulates transactions locally to catch failures before wasting gas, then submits them to the network. For the MVP I ran it on a basic VPS. The platform pays gas costs but offsets this with the 10 percent commission on bets. Users can still interact with the contract directly if they want, but most people just use the gasless option. The Oracle Problem Getting sports results onto the blockchain was the hardest challenge. Blockchains cannot access external data directly; they need oracles. Chainlink is the standard, and they offer sports data oracles that aggregate match results from multiple sources. The problem is coverage. Chainlink only covers major leagues like NFL, NBA, and Premier League. The client wanted smaller leagues and basically all sports. I checked alternative oracle networks like Band Protocol and API3, but coverage was even worse. Oracle providers focus on price feeds because that is where demand is; sports betting is still a small market. I designed a hybrid system. 
Major fixtures use Chainlink oracles, fully decentralized and trustless. For everything else, we fall back to a manual oracle where a validator (initially the client) submits results. Not ideal, but I added safeguards: results can only be submitted after matches end, anyone can dispute a result by staking tokens, and all updates emit events so users can monitor for manipulation. I know, the manual part undermines decentralization, but it was necessary for the MVP. The architecture supports transitioning to full oracle coverage as networks improve. I built a monitoring dashboard that compares submitted results against multiple sports APIs and flags discrepancies automatically. Chainlink oracles are also asynchronous: you send a request in one transaction and receive the result in a callback later. This means match results and bet resolution happen in separate transactions, adding latency but keeping things decentralized. I added a fallback where the validator can manually submit results after 24 hours if the oracle fails, preventing bets from getting stuck indefinitely. Testing and Launch Testing on Ethereum Sepolia testnet revealed bugs that would have been catastrophic with real money. One early version allowed users to overpay when accepting bets, and the excess would be locked forever. Another version had a race condition where two people could accept the same open bet simultaneously. I wrote over 20 test cases covering normal flows, edge cases, and attack scenarios. For the MVP, we deployed on Ethereum Sepolia and Polygon Mumbai testnets. The frontend connected to MetaMask and let users place bets, accept bets, and view results using mock fixture data. Gas costs on Polygon were 100x cheaper than Ethereum, making it the obvious choice for mainnet. I set up monitoring to index contract events, giving us a queryable API for bet history, user activity, and platform revenue without constantly hitting the blockchain. 
The frontend used this for leaderboards and activity feeds. Lessons Learned (Images are from the Prototype) After six months, I delivered a working proof of concept. The client could show investors that the core idea worked: users bet with any crypto, the platform converts everything to a stable token, matches settle fairly, and no centralized party holds user funds. The handshake protocol handled hundreds of test bets without issues. The biggest lesson was that blockchain development is 80 percent edge cases. The happy path is easy; the hard part is handling all the ways things can break: users trying to exploit bugs, network congestion breaking oracles, tokens with weird behaviors, race conditions in concurrent transactions. Every state transition needs bulletproof validation. Oracles are still the weakest link. Chainlink is great but has coverage gaps, and falling back to manual data entry undermines decentralization. If I built this again, I would spend more time securing sports data partnerships or designing a decentralized validator network with proper incentives. The meta-transaction pattern worked well for user experience, but running a relayer is operationally complex. You need monitoring, reserves, gas price strategies, and graceful failure handling. For production I would use a service that already exists (there are even more of them now than when I was working on this project) instead of maintaining infrastructure. The MVP gave the client what he needed for his team to complete the product. The core contracts were the foundation of the platform, though they were audited and extended later. It was a challenging six months that pushed my understanding of smart contract design, tokenomics, and the practical limitations of decentralized systems."
    },
    {
      "id": "project:sina-robotics",
      "type": "project",
      "title": "Sina Robotics: Designing Surgical Robots for International Markets",
      "slug": "sina-robotics",
      "url": "/projects/sina-robotics.html",
      "sourcePath": "projects/sina_robotics.md",
      "tags": [
        "Robotics",
        "Medical Devices",
        "Industrial Design"
      ],
      "description": "Industrial design leadership for robotic telesurgery hardware, surgical tools, ergonomics, and regulatory-ready production.",
      "date": "2026-06-02T13:18:40.000Z",
      "text": "Sina Robotics: Designing Surgical Robots for International Markets Industrial design leadership for robotic telesurgery hardware, surgical tools, ergonomics, and regulatory-ready production. Sina Robotics: Designing Surgical Robots for International Markets | Behnam Khorsandian In 2019, I joined Sina Robotics as an Industrial Designer to work on a robotic telesurgery system. The company had spent a decade in R&D developing medical robotics and training simulators, and was preparing to commercialize their first surgical robot. Three months in, I was promoted to Design Manager, leading a team of four Industrial Designers and Mechanical Engineers. The goal was to improve system usability while meeting international medical device standards. That meant increasing range of motion in the robotic arms, designing surgical instruments and detection systems, and optimizing the surgeon console ergonomics. Within a year, we delivered the designs that helped close the first international deal and positioned the company as one of the few nations with certified robotic telesurgery systems. [PLACEHOLDER: Hero image of surgical robot system] The Task The core challenge was balancing innovation with certification requirements. Medical devices need ISO13485 (quality management), IEC80601 (safety standards), and various ISO/TC 299 regulations before they can be sold internationally. Every design decision had to satisfy both engineering constraints and regulatory documentation. The existing system had usability problems reported by R&D during testing. Robotic arms had limited range, making certain surgical positions impossible. The surgeon console was ergonomically flawed, leading to fatigue during long procedures. There was no automated tool detection, forcing manual configuration before each operation. And manufacturing costs were higher than target markets would tolerate. 
[PLACEHOLDER: Technical diagram showing robotic arm range improvements] My team worked on four parallel tracks: mechanical redesign of the robotic arms to extend reach without sacrificing precision, development of surgical instruments with standardized interfaces, integration of tool detection sensors into the instrument holders, and ergonomic optimization of the control handles and console dimensions. The robotic arm redesign required collaboration with mechanical engineers to model joint configurations that increased workspace volume while maintaining submillimeter accuracy. We tested dozens of configurations in simulation before prototyping. The challenge was not just achieving range; it was doing so without adding weight that would slow response times or require stronger actuators. [PLACEHOLDER: CAD models comparing old vs new arm configurations] Tool Detection and Surgical Instruments Designing the surgical instruments meant understanding how surgeons interact with tools during procedures. We studied existing robotic surgery platforms and interviewed surgeons to identify pain points. The result was a modular instrument design with quick-release mechanisms and standardized mechanical interfaces. The tool detection system used a combination of RFID tags embedded in each instrument and proximity sensors in the robotic arm holders. When a surgeon loaded an instrument, the system automatically identified the tool type and configured control parameters. This eliminated setup time and reduced human error. [PLACEHOLDER: Exploded view of surgical instrument with RFID integration] The instruments themselves needed to be cost-effective to manufacture while meeting sterilization and durability requirements. We chose materials compatible with autoclave cycles and designed geometries that minimized stress concentrations during repeated use. Every component was documented for regulatory submission. 
Ergonomics and the Surgeon Console The surgeon console is where operators spend hours in delicate procedures. Poor ergonomics translate directly to fatigue, reduced precision, and potential errors. We analyzed anthropometric data for target markets, built mockups at different scales, and conducted user testing with surgeons. [PLACEHOLDER: Ergonomic analysis diagrams of console workspace] The control handles were the most critical interface. We iterated on grip angle, button placement, haptic feedback integration, and force response curves. The final design accommodated a wide range of hand sizes and allowed natural wrist positions during extended use. We also adjusted the console height, screen positioning, and foot pedal placement based on percentile data. Prototyping was rapid. We used 3D printing for form validation and CNC machining for functional prototypes that surgeons could actually test. Feedback cycles were tight, sometimes turning around modifications in days. [PLACEHOLDER: Photos of console prototype iterations] Manufacturing and Cost Optimization Meeting international price targets required aggressive cost reduction without compromising quality. I supervised exterior part fabrication and instrument production, evaluating different manufacturing methods for each component. For plastic housings, we switched from CNC machining to injection molding after verifying production volumes justified tooling costs. For metal components, we optimized material selection and simplified geometries to reduce machining time. Some parts moved from metal to reinforced polymers where structural requirements allowed. [PLACEHOLDER: Manufacturing process comparison charts] The result was a 10 percent reduction in overall production costs and a 30 percent decrease in fabrication expenses. These savings made the system competitive in price-sensitive markets like Indonesia, where we closed the first international deal. 
Certification and Standards Every design change fed into regulatory documentation. ISO 13485 requires traceability from requirements through validation, so we maintained detailed records of design decisions, test results, and material certifications. IEC 80601 governs electrical safety and performance, which influenced component selection and wiring design. We worked closely with the compliance team to ensure drawings, BOMs, and test reports met documentation standards. Any modification triggered a review process to assess regulatory impact. This added overhead but prevented expensive redesigns later. [PLACEHOLDER: Certification timeline diagram] The R&D team identified issues during testing that had regulatory implications. For example, early prototypes had electromagnetic interference problems that violated IEC standards. We redesigned shielding and grounding schemes, then revalidated. These iterations were time-consuming but necessary. The First International Deal After a year of parallel development, we delivered complete design packages to R&D. The improved robotic arms, surgical instruments, tool detection system, and ergonomic console came together in a system that met certification requirements and user needs. [PLACEHOLDER: Final assembled system photo] The timing was tight. Indonesia was evaluating multiple robotic surgery platforms, and our system competed against established players. The cost optimization work made our pricing competitive, while the usability improvements and certifications demonstrated maturity. We won the deal, making our country one of the first to export robotic telesurgery systems. Lessons Learned Managing a design team in a regulated industry is different from typical product design. Every decision has compliance implications, and iteration is expensive. The key is front-loading research so you get closer to the right answer on the first attempt. Collaboration between industrial designers and mechanical engineers was essential. 
Designers pushed for user experience and manufacturability, while engineers enforced structural and performance constraints. Tension between these priorities led to better solutions than either discipline alone. [PLACEHOLDER: Team collaboration workspace photo] The manufacturing optimization taught me that cost reduction is not about cutting corners; it is about understanding processes deeply enough to find inefficiencies. Changing a fillet radius or material grade can save thousands without affecting performance. Certification timelines dominate product development in medical devices. Features that seem simple can take months to validate and document. Planning around these constraints is critical; you cannot just iterate fast and fix things later. The project positioned Sina Robotics as a serious player in surgical robotics. The system we designed formed the foundation for their commercial product line and demonstrated that domestic engineering could compete internationally in advanced medical technology."
    },
    {
      "id": "service:ai-tuning",
      "type": "service",
      "title": "AI Tuning",
      "slug": "ai-tuning",
      "url": "/services/ai-tuning.html",
      "sourcePath": "pages/services/ai-tuning.html",
      "tags": [
        "Agentic AI",
        "Chatbots",
        "Fine-tuning"
      ],
      "description": "You already adopted AI. But it still feels like a demo. We engineer it to production quality, accurate, on-brand, and actually useful.",
      "text": "AI Tuning You already adopted AI. But it still feels like a demo. We engineer it to production quality, accurate, on-brand, and actually useful. Agentic AI, Chatbots, Fine-tuning AI Tuning | unshaped Menu ping me Services Projects Cookbooks Blogs Contact Services / AI Tuning 01 - service AI Tuning You shipped an AI feature and everyone liked it. Then real users showed up. Now it contradicts itself, invents facts, and answers in a voice that sounds nothing like you. That gap between the demo and the daily reality is the whole problem. We close it, so the same question gets the same answer, the tone is yours, and your team trusts it enough to put in front of customers. Book a call See the work What you walk away with An AI that gives the same answer twice, not a coin flip. Outputs in your voice, grounded in your real knowledge. A system your team trusts enough to actually ship. How it works Step 1 Audit Find where it breaks, and why. Step 2 Engineer RAG, fine-tuning, guardrails, and evals. Step 3 Ship Deployed, monitored, and handed over. Proof GoLEAD - EdTech Zacky, a production RAG agent that builds courses and exam questions from raw PDFs. 90% question-writing time saved RAG Agents Fine-tuning @ss/intel - lead intelligence An LLM that audits content and writes SEO metadata from inside the CMS. 6 SEO dimensions audited in-editor LLM Embeddings Edge Under the hood RAG and knowledge assistants Fine-tuning Agentic workflows Evals and guardrails Self-hosted models MLOps deployment Is your AI problem fixable? Questions Do you build on OpenAI, or self-hosted models? Both. We choose based on your data sensitivity, cost, and latency. How long until it is in production? A scoped pilot ships in weeks, not quarters. The audit comes first. Tell me what your AI keeps getting wrong. That is a conversation worth having. 
Book a call Explore next service Data Clarity UnShaped © 2025 ping me WhatsApp Telegram Privacy Policy Home Services Projects Cookbooks Blogs Contact Connect ping me WhatsApp Telegram"
    },
    {
      "id": "service:data-clarity",
      "type": "service",
      "title": "Data Clarity",
      "slug": "data-clarity",
      "url": "/services/data-clarity.html",
      "sourcePath": "pages/services/data-clarity.html",
      "tags": [
        "Business Intelligence",
        "Data Analytics",
        "Machine Learning"
      ],
      "description": "You're sitting on data. You just can't see what it's telling you. We turn them into stories which you know the next act already.",
      "text": "Data Clarity You're sitting on data. You just can't see what it's telling you. We turn them into stories which you know the next act already. Business Intelligence, Data Analytics, Machine Learning Data Clarity | unshaped Menu ping me Services Projects Cookbooks Blogs Contact Services / Data Clarity 02 - service Data Clarity You are sitting on years of data. Sales logs, user events, spreadsheets nobody opens. Somewhere in there is the reason your best month was your best month, and you cannot see it. So every decision becomes a guess dressed up as a hunch. We turn that pile into a story so clear you already know the next move, with dashboards that answer questions and models that tell you what is likely to happen next. Book a call See the work What you walk away with One view that answers your real questions, not ten dashboards that raise more. A forecast of what is likely next, not just a report of what already happened. Decisions backed by the number, not the loudest opinion in the room. How it works Step 1 Consolidate Pull the scattered data into one place. Step 2 Model Find the patterns and build the forecast. Step 3 Tell the story Dashboards anyone can read and act on. Proof Quantoshi - fintech A predictive trading engine that reads market timeseries and acts on the signal. 24/7 automated signal monitoring Timeseries Predictive models Explainable AI @ss/intel - lead intelligence A time-decay lead-scoring model that ranks which leads are worth calling now. 1st leads sorted by likelihood to close Scoring Supervised learning Feature engineering Under the hood Business intelligence Dashboards Predictive modeling Timeseries forecasting Data pipelines Explainable AI Is your data ready to talk? How is most of your data stored today? Scattered spreadsheets and exports A few separate apps and tools One central database or warehouse Questions Do we need clean data before we start? No. 
Cleaning and consolidating the mess is usually step one, and it is the part we are good at. Is this just dashboards, or actual prediction? Both, in that order. We make the past readable first, then build the forecast on top. Tell me what your data should be telling you. That is a conversation worth having. Book a call Explore next service Solution Architecture"
    },
    {
      "id": "service:digital-twin",
      "type": "service",
      "title": "Digital Twin",
      "slug": "digital-twin",
      "url": "/services/digital-twin.html",
      "sourcePath": "pages/services/digital-twin.html",
      "tags": [
        "SaaS",
        "Marketplace",
        "Platforms",
        "Redesign"
      ],
      "description": "Enabling founders and operators with a digital, scalable, and owned version of their business. Gateway to Digital Transformation.",
      "text": "Digital Twin Enabling founders and operators with a digital, scalable, and owned version of their business. Gateway to Digital Transformation. SaaS, Marketplace, Platforms, Redesign Digital Twin | unshaped Menu ping me Services Projects Cookbooks Blogs Contact Services / Digital Twin 04 - service Digital Twin You built this business by hand, and that is exactly the problem. The knowledge lives in your head. The process runs because you run it. Take a week off and it wobbles. You cannot grow past your own hours, because the business is wearing you as a load-bearing wall. We turn how you operate into owned software. The repeatable parts become a system that runs and scales without you, and belongs to you outright. Book a call See the work What you walk away with Your process as software that runs whether you are in the room or not. A business that can take on more without you working more. Software you own as an asset, not a subscription you rent forever. How it works Step 1 Map Trace how the business actually runs. Step 2 Build Turn the repeatable parts into software. Step 3 Hand over Owned, documented, and yours to scale. Proof ShapeShifters - lead intelligence platform Turned a manual lead workflow into an owned SaaS platform running at the edge. SaaS manual workflow, now software SaaS Platform Cloudflare Workers GoLEAD - EdTech platform A repeatable course-and-exam process, rebuilt as a platform anyone on the team can run. 90% of a manual task automated Platform Automation Multi-user Under the hood SaaS Platforms Marketplaces Workflow automation Edge compute Process design How many hours could you reclaim? People doing the manual work Hours each, per week Share that is repetitive Hours reclaimed / month At a $40/h loaded cost Questions Do I own the software, or lease it from you? You own it. That is the whole point. It is an asset on your side, not a subscription on ours. What if my process is too messy to automate? Messy is normal. 
Mapping it honestly is step one, and often the most valuable part by itself. Tell me the part of the business only you can run. That is a conversation worth having. Book a call Explore next service AI Tuning"
    },
    {
      "id": "service:solution-architecture",
      "type": "service",
      "title": "Solution Architecture",
      "slug": "solution-architecture",
      "url": "/services/solution-architecture.html",
      "sourcePath": "pages/services/solution-architecture.html",
      "tags": [
        "Strategy",
        "Brainstorming",
        "Roadmap",
        "Consulting"
      ],
      "description": "You have the team. You need the direction. Senior-level technical leadership and problem solving on a retainer basis.",
      "text": "Solution Architecture You have the team. You need the direction. Senior-level technical leadership and problem solving on a retainer basis. Strategy, Brainstorming, Roadmap, Consulting Solution Architecture | unshaped Menu ping me Services Projects Cookbooks Blogs Contact Services / Solution Architecture 03 - service Solution Architecture You have engineers. What you do not have is the senior voice in the room who has shipped this before and can tell you which path is a trap before you walk down it. Hiring a full-time CTO at this stage is heavy, slow, and expensive. Flying blind on the big calls is more expensive, just later. This is senior technical leadership on a retainer: the judgment without the equity and the payroll. Book a call See the work What you walk away with A clear technical roadmap your team can actually build against. The expensive mistakes caught on a whiteboard, not in production. A senior partner on call for the decisions that are hard to reverse. How it works Step 1 Orient Understand the business and the stack. Step 2 Direct Architecture, roadmap, and hard calls. Step 3 Stay on call Retainer support, not a one-off report. Proof GoLEAD - EdTech platform Architected a multi-service EdTech platform from zero, including its RAG agent. 0 to 1 architecture, built and shipped Microservices System design AI integration Sina Robotics - surgical robotics Technical leadership in a high-stakes domain where errors are not an option. 1 surgical-grade vision system Image processing OpenCV Hard domain Under the hood Fractional CTO System architecture Technical strategy Roadmap Technical due diligence Team mentoring Fractional vs full-time Days per week Fractional, monthly Full-time equivalent Questions Do you write code, or just advise? Both, as needed. The point is senior judgment. Sometimes that means architecture, sometimes it means getting hands dirty on the hard part. What does a retainer actually look like? 
A set number of days a week or month, scoped to your stage. It flexes as you grow. Tell me the technical decision keeping you up. That is a conversation worth having. Book a call Explore next service Digital Twin"
    }
  ]
}