Abstraction of Data Types: From Interfaces to Algebraic Data Types

Programming Languages, Type Systems, Software Design

2025-09-19

Info

This article originated from a casual conversation. Not long ago, I was asked an object-oriented design question on WeChat. While answering, I suddenly had some ideas that I found quite interesting, so I wrote them down to share with everyone. Please forgive me for ending with some philosophical talk.

First, let me reconstruct the scene from that time. I'll directly paste the conversation content:

Discussion about instanceof design issues

2025-09-18

Someone

I have a simple OOP question to ask.

What's up?

Someone

I was working on a FIT2099 object-oriented design assignment and needed to implement a function that behaves differently based on object types. But the feedback I got from my tutor was that according to this course's standards, using instanceof and type casting is not allowed. I need to redesign it.

According to this course's standards, using instanceof will result in heavy deductions because it represents that you haven't made good use of polymorphism, composition, and the type system.

Someone

How can I avoid using instanceof? I'm dealing with an interface class here, which is also a layer of abstraction, not a concrete implementation class.

This is perhaps a design issue worth exploring. In engineering, we might indeed treat this as technical debt, implement it with instanceof first, and refactor later. If it's just a small feature, this might be acceptable. However, as our codebase grows larger and more complex, this design can bring many troubles. Therefore, under this course's code purity requirements, we need to avoid using instanceof. The core issue is that an instanceof chain is never checked for exhaustiveness. Unless we confirm through other means that the branches cover every case, we always risk runtime errors.

Someone

So what specifically should I do?

You can consider using generics to let the compiler help you dispatch behavior through type parameters. Or, you can use sealed classes in Java 17 to define a closed type hierarchy, allowing the compiler to help you check completeness. If you want to be more conservative and not use too many new features, you can also discuss with your team members and establish a mutually accepted enum type. Then you can get this enum type through some method of the object, allowing you to bypass instanceof for some logic judgments. You can also consider using some design patterns, like Strategy, Observer, Visitor, etc., to avoid hardcoded type checking through object-oriented abstraction.

This conversation reveals a deep programming language design problem: How can we maintain code flexibility while letting the compiler help us discover errors?

The essence of the instanceof problem is actually a trade-off between runtime type checking and compile-time type safety. When we write code like this:

if (obj instanceof String) {
    // Handle strings
} else if (obj instanceof Integer) {
    // Handle integers
} else if (obj instanceof Double) {
    // Handle doubles
}

We're essentially telling the compiler: "Trust me, I'll handle all possible cases." But if a new type is added, the compiler can't remind us to update this logic. This is the risk brought by incompleteness.
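
To make that risk concrete, here is a minimal Java sketch. The describe helper and the class name are hypothetical, not part of the original assignment. The chain compiles without complaint even though it has no branch for Long, and the gap only shows up when the program runs.

```java
// Hypothetical helper illustrating an open-ended instanceof chain.
final class InstanceofChain {
    static String describe(Object obj) {
        if (obj instanceof String) {
            return "string";
        } else if (obj instanceof Integer) {
            return "integer";
        } else if (obj instanceof Double) {
            return "double";
        }
        // Nothing forces us to add a branch when a new type shows up,
        // so every unhandled type silently lands here.
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(describe("hi"));  // string
        System.out.println(describe(42));    // integer
        System.out.println(describe(42L));   // unknown: Long was never handled
    }
}
```

The fallback hides the mistake instead of reporting it, which is exactly what a compiler-checked alternative is meant to prevent.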

This problem can be more precisely framed within the Expression Problem. The Expression Problem describes the challenge in programming languages of how to easily extend both new data types and new operations:

  • Object-oriented nominal subtyping: Easy to extend with new types, but unfriendly to extending with new operations
  • Algebraic data types (sum types): The opposite—easy to extend with new operations, but less friendly to adding new types

When we use instanceof chained branches, we're essentially extending along the operation dimension, but this extension cannot trigger compiler reminders when new type variants are added—this is a typical manifestation of the "new type" difficulty in the Expression Problem.
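
One way to see both sides of this trade-off is the Visitor pattern mentioned in the conversation. The sketch below uses hypothetical names (Shape, ShapeVisitor, AreaVisitor, NameVisitor) and is only one possible arrangement. Adding a new operation means writing one more visitor class. Adding a new shape means adding a method to ShapeVisitor, which makes every existing visitor fail to compile until it handles the new case.

```java
// Hypothetical shape hierarchy; all names are illustrative.
interface ShapeVisitor<R> {
    R visitCircle(Circle circle);
    R visitRectangle(Rectangle rectangle);
    // Adding visitTriangle here would break every visitor until it is updated.
}

interface Shape {
    <R> R accept(ShapeVisitor<R> visitor);
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public <R> R accept(ShapeVisitor<R> visitor) { return visitor.visitCircle(this); }
}

final class Rectangle implements Shape {
    final double width, height;
    Rectangle(double width, double height) { this.width = width; this.height = height; }
    public <R> R accept(ShapeVisitor<R> visitor) { return visitor.visitRectangle(this); }
}

// A new operation is just a new class; no existing code changes.
final class AreaVisitor implements ShapeVisitor<Double> {
    public Double visitCircle(Circle c) { return Math.PI * c.radius * c.radius; }
    public Double visitRectangle(Rectangle r) { return r.width * r.height; }
}

final class NameVisitor implements ShapeVisitor<String> {
    public String visitCircle(Circle c) { return "circle"; }
    public String visitRectangle(Rectangle r) { return "rectangle"; }
}

final class VisitorDemo {
    public static void main(String[] args) {
        Shape shape = new Rectangle(2.0, 3.0);
        // prints "rectangle area = 6.0"
        System.out.println(shape.accept(new NameVisitor()) + " area = " + shape.accept(new AreaVisitor()));
    }
}
```

The visitor trades one difficulty for the other: operations become cheap to add, while new types become the expensive change, although at least the compiler announces it.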

This article will take you on a journey from specific problems to abstract solutions. We'll see how different programming languages solve this seemingly simple yet deeply challenging problem in their own ways. This isn't just a language comparison, but understanding the evolution of programming language design philosophy.

In programming, abstraction is an important concept. It helps us manage complexity, making code easier to maintain and extend. Abstraction lets us hide implementation details and focus on modules' core functionality and behavior, without being distracted by overly specific details. Both functional programming and object-oriented programming value abstraction, just with different implementation approaches.

Now, let's start by understanding the essential motivation of abstraction, and gradually explore how various languages solve this instanceof dilemma.

Data type abstraction runs through the entire history of programming languages. All languages face the same problem: How do we describe the shape and behavior of data so that compilers/interpreters can cooperate with humans to build reliable systems? This article will take you on a conceptual journey, starting from mainstream object-oriented techniques and moving toward more expressive functional paradigms, showing how these concepts accumulate rather than replace each other.

The Essential Motivation of Abstraction: From Concrete to General

Before diving into the abstraction mechanisms of various programming languages, let's first understand a fundamental question: Why do we need abstraction?

The essential motivation of abstraction stems from our need to manage complexity. There are often several ways to implement the same goal. Abstraction allows us to:

  1. Hide implementation details - Users only need to care about "what can be done," not "how to do it"
  2. Unify operation interfaces - Different implementations can be accessed through the same interface
  3. Facilitate replacement and extension - Implementations can be changed without affecting users

Returning to the original instanceof problem, the core goal of abstraction is to convert runtime type judgments into compile-time type guarantees. Let's understand this through a simple example.

Shape Processing Abstraction

Suppose we need to handle area calculations for different shapes. Without abstraction, we might write:

def calculate_area(shape):
    if isinstance(shape, Circle):
        return 3.14159 * shape.radius ** 2
    elif isinstance(shape, Rectangle):
        return shape.width * shape.height
    elif isinstance(shape, Triangle):
        return 0.5 * shape.base * shape.height
    # If a Square type is added later, it's easily missed here!

The problems with this code are obvious:

  • Every time a new shape is added, this function must be modified
  • No tool checks that every case is handled; a missing branch only surfaces at runtime
  • It violates the "Open-Closed Principle (OCP)" (open for extension, closed for modification)

Abstract Solutions

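Through abstraction, we can turn this runtime judgment into a contract that is checked up front (by the compiler in statically typed languages; in Python, by the ABC machinery and an optional static type checker):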

from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        pass

class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius

    def area(self) -> float:
        return 3.14159 * self.radius ** 2

class Rectangle(Shape):
    def __init__(self, width: float, height: float):
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height

# Now we can handle all shapes uniformly, no isinstance needed
def calculate_total_area(shapes: list[Shape]) -> float:
    return sum(shape.area() for shape in shapes)

The Value of Abstraction

This simple example reveals the true value of abstraction:

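  1. Compile-time guarantees - Every shape must implement the area() method. A Java or C++ compiler rejects a class that doesn't; in Python, ABC raises a TypeError when such a class is instantiated, and a static checker such as mypy can flag it earlier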
  2. Extension-friendly - Adding new shapes only requires adding new classes, no need to modify existing code
  3. Runtime safety - No longer worry about missing certain cases
  4. Code clarity - Each class has clear responsibilities, following the Single Responsibility Principle

The essence of abstraction is converting runtime judgments like "if it's X, do Y" into compile-time guarantees like "X knows how to do Y". This not only reduces errors but also makes code easier to understand and maintain.

Now let's look at how different programming languages implement this abstraction and address the instanceof problem we began with. We'll start with the most traditional object-oriented solutions and gradually move toward more modern approaches.

Why Care About Type Abstraction

Before diving into specific languages, let's consider the two goals that any data abstraction pursues:

  1. Encapsulation of invariants — Let the compiler help us prevent invalid states from occurring
  2. Composability — Let different modules and teams exchange data without binding to specific implementations
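
The first goal can be sketched in Java with a small value type. Percentage and InvariantDemo are hypothetical names chosen for illustration. Note that the range check itself still runs at construction time; what the type system contributes is that any Percentage a method receives has already passed it.

```java
// Hypothetical value type: a Percentage can never hold an out-of-range value.
record Percentage(int value) {
    Percentage {
        // Compact constructor: runs on every construction path.
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("percentage out of range: " + value);
        }
    }
}

final class InvariantDemo {
    // Callers depend only on the type; they never re-validate the range.
    static double apply(Percentage discount, double price) {
        return price * (100 - discount.value()) / 100.0;
    }

    public static void main(String[] args) {
        System.out.println(apply(new Percentage(25), 80.0)); // 60.0
        try {
            new Percentage(150);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the only way to obtain a Percentage goes through the validating constructor, other modules can exchange it without knowing how the check is implemented, which also serves the second goal.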

Although syntax varies by language paradigm, the underlying motivations remain consistent. Next, we'll compare how these goals manifest in Java, C++, Kotlin, TypeScript, and Haskell.

Java's Interfaces and Generics: Behavior-Centric

As one of the most widely used object-oriented languages, Java is a natural first stop for our abstraction problem. Its interface and generic mechanisms respond directly to the instanceof dilemma we started with.

Since Java's class inheritance and abstract classes are similar to the Python example above, we won't provide specific examples here. We'll start directly with Java's interfaces and generics.

Early Java relied mainly on interfaces for abstraction. An interface describes a behavioral contract: a class that implements it promises to provide certain methods. This helps with dependency inversion and modular design, but says relatively little about the shape of the data involved.

The Essence of Interfaces: Behavioral Polymorphism

In Java, interfaces are not just collections of methods; they are more a form of behavioral polymorphism. Let's look at a more complete example:

// Define a payment processor interface
interface PaymentProcessor {
    boolean processPayment(Payment payment);
    void refundPayment(String transactionId);
    PaymentStatus getPaymentStatus(String transactionId);
}

// Different implementation methods
class CreditCardProcessor implements PaymentProcessor {
    public boolean processPayment(Payment payment) {
        // Credit card payment logic
        return validateCard(payment) && chargeCard(payment);
    }

    public void refundPayment(String transactionId) {
        // Credit card refund logic
        refundToCard(transactionId);
    }

    public PaymentStatus getPaymentStatus(String transactionId) {
        // Query credit card payment status
        return queryCardStatus(transactionId);
    }
}

class PayPalProcessor implements PaymentProcessor {
    public boolean processPayment(Payment payment) {
        // PayPal payment logic
        return authenticateWithPayPal(payment) && executePayment(payment);
    }

    public void refundPayment(String transactionId) {
        // PayPal refund logic
        refundViaPayPal(transactionId);
    }

    public PaymentStatus getPaymentStatus(String transactionId) {
        // Query PayPal payment status
        return queryPayPalStatus(transactionId);
    }
}

// Code using the interface, not caring about specific implementation
class PaymentService {
    private final PaymentProcessor processor;

    public PaymentService(PaymentProcessor processor) {
        this.processor = processor; // Dependency injection
    }

    public boolean handlePayment(Payment payment) {
        return processor.processPayment(payment);
    }
}

This example demonstrates several important characteristics of interfaces:

  1. Behavioral unity: All payment processors have the same method signatures
  2. Implementation diversity: Different payment methods have different internal implementations
  3. Decoupling: PaymentService only depends on the interface, not specific implementations
  4. Easy testing: PaymentProcessor can be mocked to test PaymentService
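
The testing point can be sketched with a hand-written test double. The listing above leaves Payment and PaymentStatus undefined, so the shapes used here are assumptions, and RecordingProcessor is a hypothetical name. No mocking library is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shapes for types the article leaves undefined.
record Payment(String id, long amountCents) {}
enum PaymentStatus { PENDING, COMPLETED, REFUNDED }

interface PaymentProcessor {
    boolean processPayment(Payment payment);
    void refundPayment(String transactionId);
    PaymentStatus getPaymentStatus(String transactionId);
}

class PaymentService {
    private final PaymentProcessor processor;
    PaymentService(PaymentProcessor processor) { this.processor = processor; }
    boolean handlePayment(Payment payment) { return processor.processPayment(payment); }
}

// Test double: records calls instead of talking to a real payment gateway.
class RecordingProcessor implements PaymentProcessor {
    final List<String> processed = new ArrayList<>();

    public boolean processPayment(Payment payment) {
        processed.add(payment.id());
        return payment.amountCents() > 0;
    }
    public void refundPayment(String transactionId) {}
    public PaymentStatus getPaymentStatus(String transactionId) { return PaymentStatus.COMPLETED; }
}

final class PaymentServiceDemo {
    public static void main(String[] args) {
        RecordingProcessor fake = new RecordingProcessor();
        PaymentService service = new PaymentService(fake);
        System.out.println(service.handlePayment(new Payment("tx-1", 500))); // true
        System.out.println(fake.processed);                                  // [tx-1]
    }
}
```

PaymentService never learns that it is talking to a fake, which is the decoupling the list describes.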

A smaller example shows what an interface leaves unsaid:

interface Renderer {
    void render(Document doc);
}

class HtmlRenderer implements Renderer {
    public void render(Document doc) { /* ... */ }
}

Interfaces make it convenient to swap implementations, but they don't say what a Document actually contains. As systems grew larger, Java 5 introduced generics to provide compile-time type parameters.

Generics can be loosely described as types parameterized by other types, which lets us reuse one interface across many element types:

interface Repository<T> {
    void save(T entity);
    Optional<T> findById(UUID id);
}

In this code, Repository<T> abstracts some entity type T, and directly tells the compiler that during checking, the save method can only accept parameters of type T, and the Optional returned by findById can only contain values of type T. This lets us write more generic code without needing to write a new interface for each entity type.
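
As a rough sketch of what this buys us, here is one possible in-memory implementation. InMemoryRepository, the idOf key extractor, and the Customer record are assumptions made for the example; the article's interface does not say how an entity exposes its id.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.function.Function;

interface Repository<T> {
    void save(T entity);
    Optional<T> findById(UUID id);
}

// Hypothetical implementation; idOf is an assumed way to read an entity's key.
final class InMemoryRepository<T> implements Repository<T> {
    private final Map<UUID, T> store = new HashMap<>();
    private final Function<T, UUID> idOf;

    InMemoryRepository(Function<T, UUID> idOf) { this.idOf = idOf; }

    @Override public void save(T entity) { store.put(idOf.apply(entity), entity); }
    @Override public Optional<T> findById(UUID id) { return Optional.ofNullable(store.get(id)); }
}

record Customer(UUID id, String name) {}

final class RepositoryDemo {
    public static void main(String[] args) {
        Repository<Customer> customers = new InMemoryRepository<>(Customer::id);
        Customer ada = new Customer(UUID.randomUUID(), "Ada");
        customers.save(ada);
        // customers.save("not a customer");  // rejected at compile time: String is not Customer
        System.out.println(customers.findById(ada.id()).map(Customer::name).orElse("missing")); // Ada
    }
}
```

The same class serves any entity type, and the compiler keeps each instance tied to exactly one of them.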

Generics bring stronger static type guarantees: the compiler can confirm that a repository only handles one consistent entity type. But Java generics are nominal and implemented through type erasure, which has practical consequences:

  • Type erasure removes type arguments during compilation, so they cannot be inspected at runtime and the JVM cannot specialize code per type argument
  • For reference type arguments, erasure typically adds no runtime cost beyond compiler-inserted casts
  • Primitive types cannot be type arguments, so values are boxed and unboxed; the JIT's escape analysis can remove some of these allocations, but not all
  • JIT inlining can sometimes eliminate virtual call overhead as well
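
Two of these effects are easy to observe directly. The following sketch (ErasureDemo is a hypothetical name) shows that differently parameterized lists share a single runtime class, and that a List<Integer> stores boxed values that are unboxed on use.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // int cannot be a type argument, so every element is an Integer object.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {  // implicit Integer -> int unboxing on each step
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // After erasure both lists are plain ArrayList at runtime.
        System.out.println(strings.getClass() == numbers.getClass()); // true
        numbers.add(1);  // autoboxed to Integer
        numbers.add(2);
        System.out.println(sum(numbers)); // 3
    }
}
```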

Nevertheless, the combination of interfaces and generics remains an important step in decoupling behavior from concrete classes while preserving a single-inheritance object model.

C++ Templates: Code Generation as Abstraction

If Java implements polymorphism through interfaces dispatched at runtime, C++ templates take a completely different path: they resolve the concrete type, and therefore the call target, at compile time. This approach challenges our traditional understanding of "abstraction" and offers a new perspective on the instanceof problem.

C++ templates are a compile-time metaprogramming mechanism that treats types as compile-time values and generates new code for each instantiation.

Function Templates: Compile-time Polymorphism

Let's start with a simple function template:

template <typename T>
T clamp(T value, T min, T max) {
    if (value < min) return min;
    if (value > max) return max;
    return value;
}

// Usage examples
int x = clamp(42, 0, 100);        // T = int
double y = clamp(3.14, 0.0, 1.0); // T = double

This example shows the basic usage of templates, but the true power of C++ templates lies in compile-time computation and type specialization:

// Compile-time computation: Template metaprogramming
template <int N>
struct Factorial {
    static constexpr int value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static constexpr int value = 1;
};

// Usage: Factorial<5>::value is computed as 120 at compile time

Class Templates: Type Generators

C++ class templates can serve as type generators, which is quite different from Java's generic classes:

template <typename T>
class Vector {
private:
    T* data;
    size_t size;
    size_t capacity;

public:
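    Vector() : data(nullptr), size(0), capacity(0) {}
    ~Vector() { delete[] data; }  // release the owned buffer

    // Copying would lead to a double delete, so this sketch forbids it
    Vector(const Vector&) = delete;
    Vector& operator=(const Vector&) = delete;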

    void push_back(const T& value) {
        if (size >= capacity) {
            resize(capacity == 0 ? 1 : capacity * 2);
        }
        data[size++] = value;
    }

    T& operator[](size_t index) { return data[index]; }
    const T& operator[](size_t index) const { return data[index]; }

    size_t getSize() const { return size; }

private:
    void resize(size_t new_capacity) {
        T* new_data = new T[new_capacity];
        for (size_t i = 0; i < size; ++i) {
            new_data[i] = data[i];
        }
        delete[] data;
        data = new_data;
        capacity = new_capacity;
    }
};

// Different T generates completely different classes
Vector<int> intVector;        // Generates Vector<int>
Vector<std::string> strVector; // Generates Vector<std::string>

Template Specialization: Conditional Behavior

C++ templates support specialization, allowing us to provide special implementations for specific types:

template <typename T>
class StringConverter {
public:
    static std::string toString(const T& value) {
        return std::to_string(value);
    }
};

// Specialization for std::string
template <>
class StringConverter<std::string> {
public:
    static std::string toString(const std::string& value) {
        return value;
    }
};

// Specialization for pointers
template <typename T>
class StringConverter<T*> {
public:
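    static std::string toString(T* ptr) {
        if (!ptr) return "nullptr";
        std::ostringstream out;                // requires <sstream>
        out << static_cast<const void*>(ptr);  // address format is implementation-defined, usually hex
        return out.str();
    }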
};

Modern C++: Concepts

C++20 introduced Concepts, making template constraints clearer and more concise:

template <typename T>
concept Numeric = std::is_integral_v<T> || std::is_floating_point_v<T>;

template <Numeric T>
T add(T a, T b) {
    return a + b;
}

// Usage
add(3, 4);        // Correct: int is Numeric
add(3.5, 2.1);    // Correct: double is Numeric
add("hello", "world"); // Error: const char* is not Numeric

Fundamental Differences from Java Generics

C++ templates and Java generics have essential differences:

  1. Compile-time vs Runtime: C++ templates generate code for each instantiation at compile time; Java generics are checked at compile time and then erased, so type arguments are absent at runtime
  2. Type preservation: C++ preserves complete type information, Java erases type parameters
  3. Performance: C++ templates pursue "zero-overhead abstraction", paid for with code bloat, longer compilation times, and more complex error messages
  4. Expressiveness: C++ templates support compile-time computation and metaprogramming, while Java generics mainly serve type safety

The trade-off, then, is between goals and costs: templates allow abstraction and optimization at compile time that Java generics cannot express, but the build and the diagnostics become heavier.

The theme emphasized by templates will repeatedly appear later: Abstraction is not just about object interfaces, but about operating on families of related types.

Java Interfaces vs C++ Concepts: Two Different Abstraction Philosophies

Java's interfaces and C++'s abstraction mechanisms represent two different design philosophies. Let's understand their differences through concrete examples.

Java Interfaces: Explicit Behavioral Contracts

Java's interfaces are explicit behavioral contracts. Every implementing class must declare, by name, which interfaces it implements:

// Java's Comparator interface
public interface Comparator<T> {
    int compare(T o1, T o2);
    boolean equals(Object obj);

    // Default methods (Java 8+)
    default Comparator<T> reversed() {
        return Collections.reverseOrder(this);
    }

    default Comparator<T> thenComparing(Comparator<? super T> other) {
        return (c1, c2) -> {
            int res = compare(c1, c2);
            return (res != 0) ? res : other.compare(c1, c2);
        };
    }
}

// Concrete implementation
class StudentComparator implements Comparator<Student> {
    @Override
    public int compare(Student s1, Student s2) {
        return Integer.compare(s1.getGrade(), s2.getGrade());
    }
}

// Usage
List<Student> students = ...;
students.sort(new StudentComparator());
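
For comparison, the same ordering can usually be assembled from the JDK's own factory and default methods without writing a comparator class. This is a sketch under the assumption that Student is a simple record with name and grade; the article does not define it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical Student shape; the article does not define one.
record Student(String name, int grade) {}

final class ComparatorDemo {
    // Composed from Comparator.comparingInt and the thenComparing default method.
    static final Comparator<Student> BY_GRADE_THEN_NAME =
            Comparator.comparingInt(Student::grade).thenComparing(Student::name);

    public static void main(String[] args) {
        List<Student> students = new ArrayList<>(List.of(
                new Student("Mei", 90), new Student("Ana", 85), new Student("Bo", 90)));
        students.sort(BY_GRADE_THEN_NAME);
        System.out.println(students.get(0).name());  // Ana
        students.sort(BY_GRADE_THEN_NAME.reversed());
        System.out.println(students.get(0).name());  // Mei
    }
}
```

Default methods are what make this composition possible: the contract stays a single abstract method, and the combinators come for free.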

C++ Concepts: Abstraction Based on Compile-time Constraints

C++ doesn't have a true interface concept, but can achieve similar functionality through abstract base classes and Concepts:

// C++ traditional approach: Abstract base class
template <typename T>
class Comparator {
public:
    virtual ~Comparator() = default;
    virtual int compare(const T& a, const T& b) const = 0;
};

class StudentComparator : public Comparator<Student> {
public:
    int compare(const Student& a, const Student& b) const override {
        return a.getGrade() < b.getGrade() ? -1 :
               a.getGrade() > b.getGrade() ? 1 : 0;
    }
};

// Modern C++: Concepts (C++20)
template <typename T>
concept Comparable = requires(const T& a, const T& b) {
    { a < b } -> std::convertible_to<bool>;
    { a > b } -> std::convertible_to<bool>;
    { a == b } -> std::convertible_to<bool>;
};

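// Sorting function constrained by a requires-clause; the comparator is
// checked against the element type, not the iterator type
// (std::iter_value_t comes from <iterator>)
template <typename It, typename Comp>
requires requires(const Comp& comp,
                  const std::iter_value_t<It>& a,
                  const std::iter_value_t<It>& b) {
    { comp(a, b) } -> std::convertible_to<int>;
}
void sort(It begin, It end, Comp comparator) {
    // Implement sorting logic
}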

Practical Comparison: Equality Concept

Let's compare the two languages through a more complex example—the Equality concept:

Java's Equality Abstraction

// Java functional interface
@FunctionalInterface
public interface EqualityChecker<T> {
    boolean areEqual(T a, T b);

    // Composition operations
    default EqualityChecker<T> and(EqualityChecker<? super T> other) {
        return (a, b) -> areEqual(a, b) && other.areEqual(a, b);
    }

    default EqualityChecker<T> or(EqualityChecker<? super T> other) {
        return (a, b) -> areEqual(a, b) || other.areEqual(a, b);
    }

    default EqualityChecker<T> negate() {
        return (a, b) -> !areEqual(a, b);
    }
}

// Usage example
class Person {
    private String name;
    private int age;

    // Static factory methods
    public static EqualityChecker<Person> byName() {
        return (p1, p2) -> p1.name.equals(p2.name);
    }

    public static EqualityChecker<Person> byAge() {
        return (p1, p2) -> p1.age == p2.age;
    }

    public static EqualityChecker<Person> byNameAndAge() {
        return byName().and(byAge());
    }
}
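
A short usage sketch of the composition follows. It repeats a trimmed copy of the interface so that it stands alone, and it gives Person a constructor, which the listing above omits; both choices are assumptions made for the example.

```java
@FunctionalInterface
interface EqualityChecker<T> {
    boolean areEqual(T a, T b);

    default EqualityChecker<T> and(EqualityChecker<? super T> other) {
        return (a, b) -> areEqual(a, b) && other.areEqual(a, b);
    }
}

// Person gets a constructor here so that the demo can create instances.
final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) { this.name = name; this.age = age; }

    static EqualityChecker<Person> byName() { return (p1, p2) -> p1.name.equals(p2.name); }
    static EqualityChecker<Person> byAge()  { return (p1, p2) -> p1.age == p2.age; }
}

final class EqualityDemo {
    public static void main(String[] args) {
        Person a = new Person("Lin", 30);
        Person b = new Person("Lin", 31);
        System.out.println(Person.byName().areEqual(a, b));                     // true
        System.out.println(Person.byName().and(Person.byAge()).areEqual(a, b)); // false
    }
}
```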

C++'s Equality Abstraction

// C++ traditional approach: Function objects
template <typename T>
struct EqualityChecker {
    virtual bool operator()(const T& a, const T& b) const = 0;
    virtual ~EqualityChecker() = default;
};

// Concrete implementation
struct PersonNameEquality : EqualityChecker<Person> {
    bool operator()(const Person& a, const Person& b) const override {
        return a.getName() == b.getName();
    }
};

// Modern C++: Lambda and Concepts
template <typename T>
concept EqualityCheckable = requires(const T& a, const T& b) {
    { a == b } -> std::convertible_to<bool>;
};

// Implementation of composition operations
template <typename T, typename F1, typename F2>
class AndEquality : public EqualityChecker<T> {
    F1 f1;
    F2 f2;
public:
    AndEquality(F1 f1, F2 f2) : f1(f1), f2(f2) {}

    bool operator()(const T& a, const T& b) const override {
        return f1(a, b) && f2(a, b);
    }
};

// Factory functions
template <typename T, typename F1, typename F2>
auto make_and_equality(F1 f1, F2 f2) {
    return AndEquality<T, F1, F2>(f1, f2);
}

// Modern approach using lambdas
auto personByNameEquality = [](const Person& a, const Person& b) {
    return a.getName() == b.getName();
};

auto personByAgeEquality = [](const Person& a, const Person& b) {
    return a.getAge() == b.getAge();
};

auto personByNameAndAge = make_and_equality<Person>(
    personByNameEquality, personByAgeEquality
);

Core Differences Comparison

  1. Type System Differences:

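    • Java: Type arguments erased at compile time, polymorphism based on inheritance and dynamic dispatch
    • C++: Type arguments preserved through instantiation, polymorphism based on templates (plus virtual functions where runtime dispatch is needed)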
  2. Memory and Performance:

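    • Java: Virtual method calls by default, with runtime overhead that the JIT can often reduce
    • C++: Template instantiation resolved at compile time, so template-based dispatch adds no runtime cost; virtual functions still cost an indirect call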
  3. Flexibility:

    • Java: Single inheritance of classes, multiple inheritance of interfaces, plus default methods
    • C++: Multiple inheritance, template specialization, Concepts constraints
  4. Error Handling:

    • Java: Compile-time checks + runtime exceptions
    • C++: Mainly compile-time errors (complex template error messages)
  5. Learning Curve:

    • Java: Simple and intuitive, easy to get started
    • C++: Complex concepts, need deep understanding of template metaprogramming

Practical Application Recommendations

Choose Java interfaces when:

  • Need runtime polymorphism
  • Team skill levels vary
  • Need simple and explicit contracts
  • Dependency injection and framework integration

Choose C++ abstraction when:

  • Performance requirements are extremely high
  • Need compile-time optimization
  • Complex type operations and metaprogramming
  • Need zero-cost abstraction

These two different abstraction philosophies each have their pros and cons, and the choice depends on specific application scenarios and team needs.

Kotlin and Modern Java's Sealed Hierarchies: Constraining Extension

As object-oriented languages matured, developers wanted compilers to understand when a type hierarchy was "complete." This led to Kotlin's sealed classes and interfaces and to Java 17's sealed classes and interfaces. A sealed hierarchy declares that only a fixed set of subtypes can implement the contract. In Kotlin they must live in the same package and module; in Java they are listed in a permits clause, or inferred when they share the source file:

Kotlin
import java.math.BigDecimal

sealed interface PaymentCommand {
    val amount: BigDecimal
}

data class Charge(
    override val amount: BigDecimal,
    val cardToken: String
) : PaymentCommand

object RefundAll : PaymentCommand {
    override val amount = BigDecimal.ZERO
}

fun PaymentCommand.describe(): String = when (this) {
    is Charge -> "Charge ${amount} to card ${cardToken}"
    RefundAll -> "Refund all remaining balance"
}

// Usage example
fun processCommands(commands: List<PaymentCommand>) {
    commands.forEach { command ->
        println(command.describe())
    }
}

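Through sealing, Kotlin (and modern Java) can provide exhaustive when/switch checking. Since the compiler knows all subtypes, no else branch is needed, and adding a new subtype turns incomplete when expressions into compile errors. On the Java side, an exhaustive switch over a sealed type relies on pattern matching for switch, which became a standard feature in Java 21. The sealing mechanism therefore strikes a balance between open interface design and the closed-world guarantees we'll see later in algebraic data types.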
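
For readers on the Java side, a rough counterpart of the Kotlin listing might look like the sketch below. It assumes Java 21 for pattern matching in switch, models RefundAll as an empty record instead of a singleton, and drops the shared amount property to stay short.

```java
import java.math.BigDecimal;

// Sealed hierarchy: only the listed types may implement PaymentCommand.
sealed interface PaymentCommand permits Charge, RefundAll {}

record Charge(BigDecimal amount, String cardToken) implements PaymentCommand {}

record RefundAll() implements PaymentCommand {}

final class SealedDemo {
    static String describe(PaymentCommand command) {
        // No default branch: the compiler knows Charge and RefundAll are the only cases.
        return switch (command) {
            case Charge c -> "Charge " + c.amount() + " to card " + c.cardToken();
            case RefundAll r -> "Refund all remaining balance";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Charge(new BigDecimal("9.99"), "tok_123")));
        System.out.println(describe(new RefundAll()));
    }
}
```

If a third record were added to the permits clause, describe would stop compiling until it gained a matching case, which is the reminder the instanceof chain could never give.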

TypeScript: A Self-Contained Structural Type System

TypeScript was born from JavaScript's dynamic ecosystem. It embraces structural typing, where compatibility depends on the shape of objects rather than their declared names. Let's explore TypeScript's powerful abstraction capabilities through practical examples.

Structural Typing and Interfaces

TypeScript's structural type system makes "duck typing" type-safe:

// Define shapes, not caring about concrete types
interface Point2D {
  x: number;
  y: number;
}

interface Point3D {
  x: number;
  y: number;
  z: number;
}

// Any object with x and y properties will work
function distance(p1: Point2D, p2: Point2D): number {
  return Math.sqrt((p1.x - p2.x) ** 2 + (p1.y - p2.y) ** 2);
}

// Structural typing makes Point3D automatically compatible with Point2D
const point3d: Point3D = { x: 1, y: 2, z: 3 };
const point2d: Point2D = { x: 4, y: 5 };

console.log(distance(point3d, point2d)); // Completely legal!

Discriminated Unions: TypeScript's ADTs

TypeScript's discriminated union types provide capabilities similar to algebraic data types:

// Discriminated union for payment commands
type PaymentCommand =
  | { kind: "charge"; amount: number; cardToken: string }
  | { kind: "refund"; transactionId: string; amount: number }
  | { kind: "query"; transactionId: string };

// Type-safe handling with exhaustiveness checking
function processPayment(command: PaymentCommand): string {
  switch (command.kind) {
    case "charge":
      return `Charging $${command.amount} to card ${command.cardToken}`;
    case "refund":
      return `Refunding $${command.amount} for transaction ${command.transactionId}`;
    case "query":
      return `Querying status for transaction ${command.transactionId}`;
    default:
      // TypeScript ensures this can never be reached
      const _exhaustiveCheck: never = command;
      return _exhaustiveCheck;
  }
}

// Helper function for exhaustiveness checking
function assertNever(value: never): never {
  throw new Error(`Unexpected object: ${value}`);
}

// Usage in more complex scenarios
function handlePaymentCommand(command: PaymentCommand): string {
  switch (command.kind) {
    case "charge":
      return `Charging $${command.amount} to card ${command.cardToken}`;
    case "refund":
      return `Refunding $${command.amount} for transaction ${command.transactionId}`;
    case "query":
      return `Querying status for transaction ${command.transactionId}`;
    default:
      // Use assertNever for better error messages
      return assertNever(command);
  }
}

Mapped Types and Conditional Types

TypeScript's type manipulation capabilities make it a true "type calculator":

// Mapped types: Create new types based on existing ones
type Optional<T> = {
  [P in keyof T]?: T[P];
};

type ReadOnly<T> = {
  readonly [P in keyof T]: T[P];
};

interface User {
  id: number;
  name: string;
  email: string;
}

type OptionalUser = Optional<User>; // All properties become optional
type ReadOnlyUser = ReadOnly<User>; // All properties become readonly

Control Flow Analysis and Type Narrowing

TypeScript's intelligent type narrowing makes runtime checks safer:

// Type guard functions
function isString(value: unknown): value is string {
  return typeof value === "string";
}

function processValue(value: unknown) {
  if (isString(value)) {
    // TypeScript knows this is string type
    console.log(value.toUpperCase());
  }

  // typeof narrowing
  if (typeof value === "number") {
    console.log(value.toFixed(2));
  }

  // Instance narrowing
  if (value instanceof Date) {
    console.log(value.toISOString());
  }
}

// Literal type narrowing
type Status = "pending" | "processing" | "completed" | "failed";

function updateStatus(status: Status) {
  if (status === "completed") {
    // Only "completed" can enter this branch
    console.log("Task completed!");
  }
}

Generics and Constraints

TypeScript's generic system provides powerful abstraction capabilities:

// Generics with constraints
interface Identifiable {
  id: string;
}

function findById<T extends Identifiable>(
  items: T[],
  id: string
): T | undefined {
  return items.find(item => item.id === id);
}

// Generic defaults
interface Repository<T = any> {
  findById(id: string): Promise<T>;
  save(entity: T): Promise<void>;
  delete(id: string): Promise<boolean>;
}

// Keyof generics
function getProperty<T, K extends keyof T>(obj: T, key: K): T[K] {
  return obj[key];
}

const user = { id: "1", name: "Zhang San", age: 25 };
const userName = getProperty(user, "name"); // Type is string
const userAge = getProperty(user, "age");   // Type is number

Because the type system is integrated with the language server, TypeScript offers a self-contained development experience: the compiler, toolchain, and ecosystem conventions reinforce consistent data modeling even in codebases that mix JavaScript, server-side frameworks, and UI libraries.

Haskell: Algebraic Data Types and Type Classes

The functional tradition pushes data abstraction to its logical conclusion — Algebraic Data Types (ADTs) and Type Classes. Let's deeply explore Haskell's pure abstraction approach.

Algebraic Data Types: Algebraic Expression of Data

Haskell's ADTs declaratively combine products ("AND") and sums ("OR"):

-- Define basic types
type Money = Double
type Text = String

-- Payment commands: Sum type (OR relationship)
data PaymentCommand
  = Charge { amount :: Money, cardToken :: Text }
  | Refund { transactionId :: Text, amount :: Money }
  | Query { transactionId :: Text }
  | RefundAll
  deriving (Show, Eq)

-- User: Product type (AND relationship)
data User = User
  { userId      :: Int
  , userName    :: Text
  , userEmail   :: Text
  , userAge     :: Int
  , isActive    :: Bool
  } deriving (Show, Eq)

-- Recursive data types: Binary tree
data BinaryTree a
  = Leaf
  | Node (BinaryTree a) a (BinaryTree a)
  deriving (Show, Eq, Functor)  -- Functor needs DeriveFunctor (on by default in GHC2021)

Pattern Matching: Exhaustiveness Guarantees

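GHC checks pattern matches for exhaustiveness. With -Wincomplete-patterns (included in -Wall) it warns about every missing case, and adding -Werror turns that warning into a compile error. Without these flags, an unhandled case still compiles and fails at runtime, so it is worth enabling them in every project:

-- Handle payment commands; with -Wall, GHC flags any constructor left unhandled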
processPayment :: PaymentCommand -> Text
processPayment (Charge amount cardToken) =
  "Charging " ++ show amount ++ " to card " ++ cardToken
processPayment (Refund transactionId amount) =
  "Refunding " ++ show amount ++ " for transaction " ++ transactionId
processPayment (Query transactionId) =
  "Querying status for transaction " ++ transactionId
processPayment RefundAll =
  "Refund all remaining balance"

-- Use pattern matching to handle recursive data types
treeSum :: Num a => BinaryTree a -> a
treeSum Leaf = 0
treeSum (Node left value right) = treeSum left + value + treeSum right

-- Complex pattern matching
validateCommand :: PaymentCommand -> Either Text PaymentCommand
validateCommand cmd@(Charge amount _)
  | amount <= 0 = Left "Charge amount must be positive"
  | otherwise   = Right cmd
validateCommand cmd@(Refund _ amount)
  | amount < 0  = Left "Refund amount cannot be negative"
  | otherwise   = Right cmd
validateCommand cmd = Right cmd
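
For readers who think in TypeScript, the same shape can be approximated with a discriminated union. This is only a sketch under my own assumptions: the `kind` tag, the `Validation` type (a minimal stand-in for `Either Text a`), and the lowercase variant names are choices I made for illustration, not a translation the Haskell code prescribes.

```typescript
// Sum type: each variant is tagged by `kind`
type PaymentCommand =
  | { kind: "charge"; amount: number; cardToken: string }
  | { kind: "refund"; transactionId: string; amount: number }
  | { kind: "query"; transactionId: string }
  | { kind: "refundAll" };

// Minimal stand-in for Haskell's `Either Text a`
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

function validateCommand(cmd: PaymentCommand): Validation<PaymentCommand> {
  // No default branch: under strict settings, removing a case makes the
  // compiler complain that the function may fall through without returning.
  switch (cmd.kind) {
    case "charge":
      return cmd.amount <= 0
        ? { ok: false, error: "Charge amount must be positive" }
        : { ok: true, value: cmd };
    case "refund":
      return cmd.amount < 0
        ? { ok: false, error: "Refund amount cannot be negative" }
        : { ok: true, value: cmd };
    case "query":
    case "refundAll":
      return { ok: true, value: cmd };
  }
}
```

The guards in the Haskell version become ordinary conditionals here, while the as-pattern `cmd@(...)` corresponds to simply returning the already narrowed `cmd`.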

Type Classes: Abstract Interfaces for Behavior

Haskell's type classes capture behavior that applies to multiple types while keeping implementations independent:

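-- Basic type class: Comparable
-- (these names shadow the Prelude's, so the module is assumed to start with
--    import Prelude hiding (compare, (<), (>), (<=), (>=))
--    import qualified Prelude as P)
class Comparable a where
  compare :: a -> a -> Ordering
  (<), (>), (<=), (>=) :: a -> a -> Bool

  -- Default implementations
  x < y  = compare x y == LT
  x > y  = compare x y == GT
  x <= y = compare x y /= GT
  x >= y = compare x y /= LT

-- Instantiate for Int, delegating to the Prelude so that compare
-- does not call back into the (<) default defined above and loop forever
instance Comparable Int where
  compare = P.compare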

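-- Semigroup: Associative operations
-- (simplified re-declarations of classes that already exist in base;
--  hide the Prelude versions if you want to compile these yourself)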
class Semigroup a where
  (<>) :: a -> a -> a

-- Monoid: Semigroup with identity element
class Semigroup a => Monoid a where
  mempty :: a
  mappend :: a -> a -> a
  mappend = (<>)

-- List monoid instance
instance Semigroup [a] where
  (<>) = (++)

instance Monoid [a] where
  mempty = []

-- Multiplicative monoid for numbers
newtype Product a = Product { getProduct :: a }
  deriving (Show, Eq)

instance Num a => Semigroup (Product a) where
  (Product x) <> (Product y) = Product (x * y)

instance Num a => Monoid (Product a) where
  mempty = Product 1
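
One way to demystify type classes is to write out by hand what the compiler does for us. In the TypeScript sketch below, a class becomes an interface of operations and an instance becomes a plain value that is passed explicitly. The names `Monoid`, `productMonoid`, `arrayMonoid`, and `mconcat` are my own illustrative choices that mirror the Haskell definitions above; this is an analogy, not how any TypeScript library defines them.

```typescript
// A type class becomes an interface of operations over A
interface Monoid<A> {
  empty: A;
  concat(x: A, y: A): A;
}

// "Instances" are ordinary values
const productMonoid: Monoid<number> = {
  empty: 1,
  concat: (x, y) => x * y,
};

function arrayMonoid<T>(): Monoid<T[]> {
  return { empty: [], concat: (x, y) => [...x, ...y] };
}

// Counterpart of Haskell's mconcat. Haskell finds the instance from the
// type; here the caller has to hand it over as the first argument.
function mconcat<A>(monoid: Monoid<A>, values: readonly A[]): A {
  return values.reduce((acc, value) => monoid.concat(acc, value), monoid.empty);
}
```

The difference in ergonomics is the point: in Haskell the `monoid` argument is implicit and resolved by the type checker, which is what makes the abstraction feel weightless.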

Practical Application: Simple Validation Example

-- Simple validation type class
class Validatable a where
  validate :: a -> Either Text a

-- Add validation for PaymentCommand
instance Validatable PaymentCommand where
  -- An as-pattern (cmd@...) names the whole value, so it can be returned
  -- unchanged; `_` is only valid in patterns, not in expressions
  validate cmd@(Charge amount _)
    | amount <= 0 = Left "Charge amount must be positive"
    | otherwise   = Right cmd
  validate cmd@(Refund _ amount)
    | amount < 0  = Left "Refund amount cannot be negative"
    | otherwise   = Right cmd
  validate cmd = Right cmd

-- Use validation
processValidatedCommand :: PaymentCommand -> Either Text Text
processValidatedCommand cmd = do
  validatedCmd <- validate cmd
  return $ processPayment validatedCmd

Haskell's type classes implement ad-hoc polymorphism while preserving type inference. Combined with higher-kinded types, they can express abstractions such as Functor and Applicative, and they underpin libraries like lens; the same ideas tend to be verbose or unsafe in languages without that support. ADTs, in turn, let you design types so that invalid values simply cannot be constructed. This "make illegal states unrepresentable" philosophy is one of the core advantages of Haskell's type system.

Deep Comparison: Java Interfaces vs Haskell Algebraic Data Types

To more deeply understand the power of Haskell's type system, let's compare Java's interface approach with Haskell's ADT approach through a concrete example.

Java's Comparator Interface: Runtime Polymorphism

Java's Comparator interface is a typical example of runtime polymorphism:

// Java Comparator: Requires explicit interface implementation
public interface Comparator<T> {
    int compare(T a, T b);
}

// Concrete implementation classes
class StudentGradeComparator implements Comparator<Student> {
    @Override
    public int compare(Student a, Student b) {
        return Integer.compare(a.getGrade(), b.getGrade());
    }
}

class StudentNameComparator implements Comparator<Student> {
    @Override
    public int compare(Student a, Student b) {
        return a.getName().compareTo(b.getName());
    }
}

// Usage: Runtime dynamic selection
Comparator<Student> comparator = getComparatorFromUser();
students.sort(comparator);

Characteristics of this approach:

  • Runtime polymorphism: Specific comparison logic is determined at runtime
  • Explicit implementation: Each comparator needs to explicitly implement the Comparator interface
  • Open extension: Anyone can create new comparators
  • Type erasure: Generic information is lost at runtime

Haskell's Ord Type Class: Compile-time Polymorphism

Haskell achieves similar functionality through type classes, but the instance to use is selected by the type checker at compile time:

-- Haskell's Ord type class (simplified from the Prelude; instances are resolved at compile time)
class Eq a => Ord a where
    compare :: a -> a -> Ordering
    (<), (<=), (>), (>=) :: a -> a -> Bool

    -- Default implementations
    x < y = compare x y == LT
    x <= y = compare x y /= GT
    x > y = compare x y == GT
    x >= y = compare x y /= LT

-- Instantiate for Student type
data Student = Student
    { name :: String
    , grade :: Int
    } deriving (Show, Eq)

instance Ord Student where
    compare (Student n1 g1) (Student n2 g2)
        | g1 /= g2  = compare g1 g2
        | otherwise = compare n1 n2

-- Usage: the Ord Student instance is selected at compile time
-- (sort comes from Data.List: import Data.List (sort))
sortStudents :: [Student] -> [Student]
sortStudents = sort  -- sort automatically uses Ord instance

Core Differences Comparison

  1. Polymorphism Mechanism:

    • Java: Runtime dispatch - Dynamic lookup through virtual function table
    • Haskell: Compile-time instance resolution - The type checker picks the instance and passes it as a dictionary; GHC can specialize this into direct calls
  2. Type Safety:

    • Java: Mostly compile-time with generics - Raw types and unchecked casts can still throw ClassCastException at runtime
    • Haskell: Compile-time guarantees - Compilation fails if type has no Ord instance
  3. Performance Characteristics:

    • Java: Virtual function call overhead + Possible boxing/unboxing
    • Haskell: Dictionary passing by default + Near zero-overhead when GHC specializes or inlines the instance
  4. Expressiveness:

    • Java: Interface constraints - Method signatures plus default and static methods (since Java 8), but no abstraction over type constructors
    • Haskell: Type class constraints - Can have default implementations and associated types
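
To make the comparison concrete, here is a sketch of the `Ord Student` instance rebuilt from small, composable pieces in TypeScript. Everything here is an assumption of mine for illustration: the `Ord` interface, the combinators `contramap` and `thenBy`, and the helper `sortWith` are hypothetical names. The code sits somewhere between the two worlds: like a Java Comparator it is an explicit value, and like a type class instance it is assembled from instances of the field types.

```typescript
interface Student {
  name: string;
  grade: number;
}

// Hypothetical Ord "dictionary": negative, zero, or positive, like Comparator
interface Ord<A> {
  compare(x: A, y: A): number;
}

const ordNumber: Ord<number> = { compare: (x, y) => x - y };
const ordString: Ord<string> = { compare: (x, y) => (x < y ? -1 : x > y ? 1 : 0) };

// Build an Ord for B from an Ord for A and a projection from B to A
function contramap<A, B>(ord: Ord<A>, project: (b: B) => A): Ord<B> {
  return { compare: (x, y) => ord.compare(project(x), project(y)) };
}

// Use `first`, and fall back to `second` only on ties
function thenBy<A>(first: Ord<A>, second: Ord<A>): Ord<A> {
  return {
    compare: (x, y) => {
      const result = first.compare(x, y);
      return result !== 0 ? result : second.compare(x, y);
    },
  };
}

// Same ordering as the Haskell instance: by grade, then by name
const ordStudent: Ord<Student> = thenBy(
  contramap(ordNumber, (s: Student) => s.grade),
  contramap(ordString, (s: Student) => s.name),
);

// Sorts a copy, leaving the input untouched
function sortWith<A>(ord: Ord<A>, items: readonly A[]): A[] {
  return [...items].sort((x, y) => ord.compare(x, y));
}
```

In Haskell the compiler threads `ordStudent` through for us, and there can be only one such instance per type; here we may build as many orderings as we like, at the price of passing them around by hand.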

More Complex Example: Equality vs Eq

Let's look at a more complex example showing the true power of Haskell's type system:

Java's Equality Checking

// Java: Need multiple interfaces and implementations
public interface EqualityChecker<T> {
    boolean areEqual(T a, T b);
}

public interface HashProvider<T> {
    int hashCode(T obj);
}

// Composite interface
public interface HashableEquality<T> extends EqualityChecker<T>, HashProvider<T> {}

// Concrete implementation
public class PersonHashableEquality implements HashableEquality<Person> {
    @Override
    public boolean areEqual(Person a, Person b) {
        return a.getName().equals(b.getName()) &&
               a.getAge() == b.getAge();
    }

    @Override
    public int hashCode(Person obj) {
        return Objects.hash(obj.getName(), obj.getAge());
    }
}

Haskell's Eq Type Class

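-- Haskell: Eq covers equality; hashing lives in a separate class
class Eq a where
    (==) :: a -> a -> Bool
    (/=) :: a -> a -> Bool

    -- Default implementations (an instance must define at least one of the two)
    x == y = not (x /= y)
    x /= y = not (x == y)

-- Automatically derive Eq instance
data Person = Person
    { name :: String
    , age  :: Int
    } deriving (Show, Eq)

-- With deriving, the compiler generates:
-- 1. Field-by-field equality for two Persons (Eq) and a textual rendering (Show)
-- 2. Compile-time checks that every field type has the required instances
-- Hashing is not part of Eq: it needs a separate class such as Hashable
-- (from the hashable package), which can be derived via Generic.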

Costs and Benefits of Haskell's Type System

Benefits

  1. Compile-time Safety:

    -- Compilation fails directly if type has no Eq instance
    findDuplicates :: Eq a => [a] -> [a]
    findDuplicates xs = [x | x <- xs, count x xs > 1]
      where count y = length . filter (== y)  -- occurrences of y in a list
    
    -- Error: No instance for (Eq SomeType)
    findDuplicates [someType1, someType2]
  2. Type Inference and Instance Resolution:

    -- The compiler sees that sort needs Ord Student and finds the instance itself
    sortStudents :: [Student] -> [Student]
    sortStudents = sort  -- sort :: Ord a => [a] -> [a]
  3. Near Zero-Overhead Under Specialization:

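    -- A SPECIALISE pragma asks GHC for a dictionary-free copy at a concrete type
    -- (insertSorted is an illustrative helper)
    insertSorted :: Ord a => a -> [a] -> [a]
    insertSorted x xs = takeWhile (< x) xs ++ x : dropWhile (< x) xs
    {-# SPECIALISE insertSorted :: Student -> [Student] -> [Student] #-}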
  4. Illegal States Unrepresentable:

    -- Prevent illegal states at compile time
    data PaymentStatus = Pending | Processing | Completed
    -- Cannot create states other than these three
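
The idea becomes more interesting once each state carries its own data. Below is a TypeScript sketch under my own assumptions: the `Payment` type, its `startedAt` and `receiptId` fields, and the function `receiptOf` are hypothetical and only serve to show the technique.

```typescript
// Each state carries exactly the data that is valid for it
type Payment =
  | { state: "pending" }
  | { state: "processing"; startedAt: Date }
  | { state: "completed"; receiptId: string };

function receiptOf(payment: Payment): string | undefined {
  // receiptId is only reachable after narrowing to the "completed" variant
  return payment.state === "completed" ? payment.receiptId : undefined;
}

// const broken: Payment = { state: "completed" };
// rejected: a completed payment must carry a receiptId
```

Compare this with a single record holding an optional `receiptId` next to a status field: there, "completed without a receipt" is a value the program can build, so every consumer has to defend against it.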

Costs

  1. Steep Learning Curve:

    • Need to understand type classes, instances, constraints, etc.
    • Type inference mechanism is complex
    • Error messages are abstract and hard to understand
  2. Long Compilation Times:

    • Complex type inference requires extensive computation
    • Compile-time optimizations increase compilation time
  3. Reduced Runtime Flexibility:

    • Dynamic behavior is limited
    • Limited runtime reflection capabilities
  4. Ecosystem Limitations:

    • Relatively fewer libraries
    • Difficult integration with mainstream OO frameworks

Practical Significance

Haskell's type system demonstrates the ultimate pursuit of programming language design: moving as many errors as possible to compile time. This design philosophy means in practice:

  • Reduced testing burden: Compiler has already eliminated entire categories of errors
  • Increased refactoring confidence: Type system guarantees safety of modifications
  • Documentation as code: Type signatures are the most accurate documentation
  • Performance optimization: Compile-time information supports deep optimization

However, this powerful capability comes at a cost. Haskell is not suitable for all scenarios, especially projects needing rapid iteration, runtime flexibility, or diverse team skills. Understanding this trade-off is key to choosing the right technology stack.

Just like our journey starting from instanceof, each abstraction approach is answering the same question: How to ensure correctness while maintaining flexibility? Haskell provides an extreme but elegant answer, and the value of this answer depends on your specific needs and constraints.

Cross-Comparison

| Feature / Language | Java Interfaces & Generics | C++ Templates | Kotlin/Java Sealed Types | TypeScript Structural Type System | Haskell ADTs & Type Classes |
| --- | --- | --- | --- | --- | --- |
| Core Abstraction | Behavioral contracts on nominal types | Compile-time code generation over types | Closed hierarchies with exhaustive analysis | Structural typing with unions, inference, and tooling | Algebraic composition of data + behavior |
| Extensibility | Open world; any class can implement | Open world; instantiations everywhere | Closed to declared subclasses | Open; compatibility by shape | Closed data (new constructors mean editing the type); open behavior via new instances |
| Runtime Representation | Erased generics, single dispatch | Specialized code per instantiation | JVM classes with metadata for sealed hierarchy | Erased at runtime but guided by control-flow narrowing | Rich compile-time info, erased to efficient core language |
| Safety Guarantees | Behavioral contracts; limited exhaustiveness checking | Depends on constraints; can be unsafe | Exhaustive when/switch over known variants | Narrowing and unions catch many runtime errors | Exhaustiveness checking of pattern matches; illegal states can be made unrepresentable |
| Tooling Experience | Mature IDE support | Powerful but complex compilers | Kotlin/Java IDEs enforce sealing rules | Language server explains inferred shapes and discriminated unions | Compiler checks instances, REPL aids exploration; class laws remain the programmer's responsibility |

This table shows that no approach dominates universally. Each step in evolution just adds new choices to the developer's toolbox.

From instanceof to Abstraction: Complete Mindset Shift

Returning to our original problem: How to avoid using instanceof? Through this exploration, we discover this is not just a technical problem, but a mindset shift.

Problem-Solving Evolution Path

  1. Identify the problem: Incompleteness and runtime risks brought by instanceof
  2. Understand the essence: Need to convert runtime type judgments into compile-time type guarantees
  3. Choose tools: Select appropriate abstraction mechanisms based on language features and scenarios
  4. Implement solutions:
    • Java/Kotlin: Use interfaces, sealed classes, pattern matching
    • C++: Utilize templates, Concepts, compile-time polymorphism
    • TypeScript: Adopt structural types, discriminated unions, type narrowing
    • Haskell: Through ADTs, type classes, pattern matching
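
The path above can be compressed into a small before-and-after. The sketch below uses TypeScript and a deliberately simple domain; `Shape`, `Circle`, `Rectangle`, `areaWithInstanceof`, and `totalArea` are illustrative names of my own, not code from the original assignment.

```typescript
interface Shape {
  area(): number;
}

class Circle implements Shape {
  readonly radius: number;
  constructor(radius: number) {
    this.radius = radius;
  }
  area(): number {
    return Math.PI * this.radius ** 2;
  }
}

class Rectangle implements Shape {
  readonly width: number;
  readonly height: number;
  constructor(width: number, height: number) {
    this.width = width;
    this.height = height;
  }
  area(): number {
    return this.width * this.height;
  }
}

// Before: every new class needs another branch, and a forgotten branch
// only shows up when this line throws at runtime.
function areaWithInstanceof(shape: unknown): number {
  if (shape instanceof Circle) return Math.PI * shape.radius ** 2;
  if (shape instanceof Rectangle) return shape.width * shape.height;
  throw new Error("Unknown shape");
}

// After: the behavior lives with the type, and a value without area()
// cannot be passed as a Shape in the first place.
function totalArea(shapes: readonly Shape[]): number {
  return shapes.reduce((sum, shape) => sum + shape.area(), 0);
}
```

Both versions compute the same numbers; what changes is who notices a missing case, the user at runtime or the compiler while you are still typing.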

Practical Application Recommendations

For Beginners:

  • Start with Java interfaces, understand basic concepts of behavioral abstraction
  • Gradually explore sealed classes and pattern matching, experience the power of compile-time checking
  • Try TypeScript's structural types, experience the type-safe version of "duck typing"

For Experienced Developers:

  • In multi-language projects, understand the mapping relationships between different abstraction styles
  • Choose appropriate technology stacks based on performance needs, team skills, ecosystem maturity
  • Establish unified abstract thinking, not limited by specific syntax

For System Designers:

  • Consider the transmission of type safety at the architectural level
  • Use strong type systems to build reliable distributed systems
  • Find a balance between flexibility and safety

Conclusion: At the Intersection of Mathematics, Language, and Computing

As I sit at my desk, thinking about how to conclude this article about data type abstraction, I find myself at an interesting intersection—here there's the rigor of computer science, the elegance of pure mathematics, and the charm of linguistics. This is precisely where my beloved interests converge.

The Computer Science Perspective: Hierarchy of Abstraction

From a computer science perspective, the abstractions we discussed today are essentially a hierarchical problem. From machine code to assembly, from procedural programming to object-oriented programming, to functional programming, each step wraps lower-level complexity into higher-level abstractions.

Physical layer → Logic gates → Instruction set → Assembly language → High-level languages → Abstract design patterns

Type systems are an important part of this abstraction pyramid. They let us discover errors at compile time rather than waiting for programs to crash in front of users. This philosophy of "preventive programming" is precisely the reflection of computer science evolving from engineering practice to scientific theory.

The Pure Mathematics Perspective: Power of Formal Systems

As a mathematics enthusiast, I'm often amazed by the striking similarity between type systems and formal systems. Haskell's type classes remind me of groups, rings, and fields in algebraic structures; the "sum" and "product" of algebraic data types remind me of disjoint unions and Cartesian products in set theory; and type inference is like theorem proving in logical systems.

-- This isn't just code, this is mathematics!
data PaymentCommand = Charge ... | Refund ... | Query ...
-- This is a sum type, corresponding to disjoint union in mathematics

The Curry-Howard isomorphism tells us: Programs are proofs, types are propositions. When we write a type-correct function, we're not just writing executable code, but constructing a mathematical proof. This idea deeply influences modern programming language design.
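
A few tiny functions make the correspondence tangible. The sketch below is in TypeScript, and the names `And`, `Or`, `modusPonens`, `andCommutes`, and `orElim` are my own. Treat it as an illustration only: TypeScript's type system is not sound as a logic (`any`, exceptions, and non-terminating functions can "prove" anything), so these are proofs in spirit rather than in the sense a proof assistant would accept.

```typescript
// "A and B" corresponds to a pair
type And<A, B> = readonly [A, B];

// "A or B" corresponds to a tagged union
type Or<A, B> = { tag: "left"; value: A } | { tag: "right"; value: B };

// Modus ponens: from "A implies B" and A, conclude B.
// The proof is nothing more than function application.
function modusPonens<A, B>(implication: (a: A) => B, premise: A): B {
  return implication(premise);
}

// "A and B" implies "B and A": swap the components of the pair
function andCommutes<A, B>([a, b]: And<A, B>): And<B, A> {
  return [b, a];
}

// Case analysis: from "A or B", "A implies C", and "B implies C", conclude C
function orElim<A, B, C>(
  either: Or<A, B>,
  onLeft: (a: A) => C,
  onRight: (b: B) => C,
): C {
  return either.tag === "left" ? onLeft(either.value) : onRight(either.value);
}
```

Notice that the bodies are almost forced by the signatures: for fully generic `A` and `B`, there is essentially one sensible way to implement each function, which is exactly what it means for the type to be the proposition and the program to be its proof.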

The Linguistics Perspective: Precise Expression of Meaning

Linguistics teaches us to focus on precision of expression. The ambiguity of natural language is often the root of errors in programming. When we say "an object," what exactly are we referring to? Is it a concrete instance, or an abstract concept?

Type systems are like a precise formal language that forces us to think clearly before coding:

  • What does this data represent?
  • What are its possible values?
  • What operations can be performed on it?

This precision training makes me pay more attention to conceptual accuracy and logical rigor when reading and writing.

Cross-Disciplinary Unity: The Essence of Abstraction

In these three fields, I see the common essence of abstraction:

In mathematics, we abstract specific numerical relationships through axioms and definitions, thereby proving universally applicable theorems.

In linguistics, we abstract specific expressions through grammatical rules, thereby constructing meaningful communication.

In computer science, we abstract specific data operations through type systems, thereby building reliable software systems.

They are all answering the same fundamental question: How to express infinite possibilities with finite rules?

Personal Programming Philosophy

It's because of these cross-disciplinary interests that I've gradually formed my own programming philosophy:

1. Mathematics is the foundation of programming: Understanding the mathematical foundations of type systems helps us better choose and use abstraction tools. When we know an interface is actually defining an algebraic structure, our designs become clearer and more purposeful.

2. Language is a tool for thinking: Choosing what programming language to use is not just a technical issue, but a choice of thinking style. Different languages shape different thinking patterns, just as natural languages influence our cognition of the world.

3. Abstraction is a bridge, not a destination: We learn various abstraction techniques ultimately to solve practical problems. Abstraction shouldn't distance us from problems, but should bring us closer to their essence.

Thoughts on the Future

Standing at this intersection, I see many interesting directions:

  • Dependent types: Integrating mathematical proofs directly into programming languages, allowing program correctness to be formally verified at compile time
  • Natural language processing: Applying type system thinking to natural language understanding and generation, building more precise human-computer interaction

Final Thoughts

Returning to our original question: How to avoid instanceof?

Now I discover this question is far more profound than the technology itself. It involves our understanding of complexity management, pursuit of formalized expression, and training in precise thinking.

Every time we choose interfaces over type checking, ADTs over conditional branches, type classes over runtime reflection, we're conducting a small philosophical practice—we believe that structured thinking can overcome chaos, precise expression can overcome ambiguity.

Perhaps this is what attracts me most about programming: it's both science and art, needing both rigorous logic and creative inspiration, connecting both mathematical purity and serving real-world needs.

I hope that in this article, you've not only learned technical knowledge but also felt the charm of this cross-disciplinary thinking. Because ultimately, the best code is not just correct, but elegant; not just functional, but expressive.