Dependent types, refinement types, gradual types, rank-n polymorphism, etc, – we are moving in the right direction
Let’s dream a bit and try to imagine what can be improved about type systems. Ability to limit and check variables and function arguments at compile time multiplies development speeds by dozens of times, but modern type systems are STILL not expressive enough - you mostly can filter out only quite basic errors. Functional programmers excel at making the wrong state inexpressible in terms of the type system, but static type systems still can’t encode very complex logic and relations.
Well, they do, but it is absolutely unusable for casual coding, fast prototyping, and most industrial use cases.
In certain fields, such as game development, the amount of rules gets crazy, and most interactions of related game mechanics can’t even be covered with tests, not to mention types. But possibly this is only because type systems are too restrictive. They don’t allow embedding crazy expressions from the the value level. Yet.
Possible Improvements For Type Systems
- Types should be first-class objects. There should be convenient type lambdas
- Types should be allowed to contain value-level expressions (Refinement Types)
- Moreover, types could contain runtime values (Gradual Types)
- The same code should be usable both at compile time and execution time
- There should be runtime conditional expressions for type checks
- All definitions and type specifications should share context with the code they’re in
- Type intersection is a fundamental operation for composing types
- BONUS: Types, sets, and functions should be unified with T̷y̷p̴e̷ ̵Fi̵l̷t̸e̵r̸s̴
- BONUS: I̸n̵v̶e̵r̶s̶e̷ f̵un̸c̴t̵i̴o̷n̴s̸ ̵s̵h̵o̸u̸l̸d̶ ̶e̷a̴s̶e̶ work with ̷a̴l̷l t̵y̷p̵e̸ ̴s̴p̵e̴c̵i̸f̵i̴c̵a̵t̷i̸o̵n̴s (next post)
- BONUS: Type/value union is another fundamental operation for representing indeterminism
What in the name of satan is all of that? Possibly, I’ll explain everything in the next blog posts, but this time, let’s just review some examples.
Notation
I’ll use pseudocode in some semi-esoteric language for clarity. It’s very expressive, so if you want something more conventional, you can change to pseudo-C++ or pseudo-Haskell in every code example - see tabs above code blocks. But I suggest to glimpse through all three.
In this section, I’ll explain shortly how to read the ideas described in this post.
Firstly, this is how type intersection is done. This would be a fundamental operation for composing types.
myVariable: MyType1
MyType2 & MyType3
MyIntersectionType = MyType1 & MyType2
MyType3
MyStruct = has myField: Integer
has myField2: String
// Values of type MyStruct are limited to those which have
// both `myField` and `myField2` fields
intersect<MyType1, MyType2> myVariable;
using MyStruct = intersect<struct { int x; },
struct { int y; }>
// Values of type MyStruct are limited to those which have
// both `myField` and `myField2` fields (struct { int x, y; })
type MyIntersectionType = MyType1 & MyType2
type MyStruct = { x :: Integer }
& { y :: Bool }
-- Values of type MyStruct are limited to those which have
-- both `myField` and `myField2` fields (MyStruct { x :: Integer, y :: Bool })
In our magical language, type-level expressions could contain runtime values. This could be both constructed types and subtractive types that would filter out values of types they’re intersecting with. For example, here we use “myNaturalVariable” in inline type definition:
myNaturalVariable: Integer
(>=0)
myVariableWithCondition: {1, 2, 3, myNaturalVariable}
if myNaturalVariable > 3
auto positive = [](int x){ return x >= 0; };
using MyNaturalType = intersect<int, refine(positive)>;
MyNaturalType myNaturalVariable;
using MyTypeWithCondition = intersect<{1, 2, 3, myNaturalVariable},
condition(myNaturalVariable > 3)>;
type MyNaturalType = Integer & refine (>=0)
myNaturalVariable :: MyNaturalType
myNaturalVariable = 4
type MyTypeWithCondition = [ 1, 2, 3, myNaturalVariable ]
& { myNaturalVariable > 3}
Unique to this theoretical language. Types should not only be on the same level as variables, but also they should act as functions, collections, and as true predicates. All these concepts can be unified with one syntax for a concept of “type filter”.
In the examples for C++ and Haskell, I use refine
function/keyword, which creates a refinement type out of function - otherwise, if we write just functions, it won’t look as clear. Also, C++ doesn’t allow type annotation of any value, so I can’t apply “type filter” idea to it.
3: Int
print (Int 3)
"123": Boolean // error
4: >0 // partially applied "greater than zero" function
// acts as a type
(=2) 2 = 2 : (=2) // hellish mashup
1: (x->1) // lambda as set/type
myList = [1, 2, 3, 4, 5]
// with this syntax, there are some cool ways to say `map (1+) myList`:
print (x:myList -> x+1) // [2, 3, 4. 5, 6]
print {x:myList + 1} // [2, 3, 4. 5, 6]
auto positive = [](int x){ return x > 0; };
refine(positive) x; // partially applied "greater than zero" function
// acts as a type
x :: Integer & refine (>0)
"123" :: Boolean -- error
4 :: refine (>0) -- partially applied greater than zero function
-- acts as a type
Dependent Types With Runtime Values
This is not possible anywhere now (is it?), but I’m sure after 30 more years of research, we’ll be able to do this in some experimental Haskell extension or language. Runtime values can be used in static types and erased after compilation, leaving dynamic checks only on entry points. Here is a simple example of what this could look like:
Dog = has isAwake: Boolean
bark() : String // type intersection takes place
if isAwake // String & (if not isAwake)
= "woof"
dog: Dog
dog.isAwake <- read Boolean
print dog.bark() // error: can't apply `dog.bark`,
// `dog.isAwake` could be false
if dog.isAwake // theoretical "if" intersects types with
// information from the condition
print dog.bark() // woof
struct Dog {
bool isAwake;
auto bark() -> intersect<string, condition(isAwake)> {
return "woof";
}
};
...
Dog dog;
std::cin >> dog.isAwake;
std::cout << dog.bark(); // error: can't apply `Dog::bark`,
// `dog.isAwake` could be false
if auto (dog.isAwake) { // theoretical "if auto" intersects types with
// information from the condition
// (of course, I had to invent another use for auto)
std::cout << dog.bark(); // woof
}
data Dog = Dog { isAwake :: Boolean }
bark :: { this :: Dog } -> String & if isAwake this
bark this = "woof"
main :: IO ()
main = do
isDogAwake <- getLine
let dog = Dog { isAwake = (read isDogAwake :: Boolean) }
print $ bark dog -- error: can't apply `bark`,
-- `dog.isAwake` could be false
if' isAwake dog then -- theoretical "if'" intersects types
-- with information from the condition
print $ bark dog -- woof
My dog actually can bark while sleeping, but in the example above, you won’t be able to call bark
function without guaranteeing that isAwake
is false.
Here is one more example of the same concept:
Dog = has maxFood = 10
food: >= 0
<= maxFood
isHungry = food != maxFood
eat newFood : where newFood + food < maxFood
if isHungry
= food <- food + newFood
dog: Dog
dog.food <- 2
dog.eat (read Integer) // error: argument newFood doesn't satisfy `newFood + food < maxFood` and `isHungry`
read Integer : clamp 0 5 : dog.eat
struct Dog {
int maxFood = 10;
intersect<int,
refine(greaterEqualThan(0)),
refine(lessEqualThan(maxFood))
> food;
bool isHungry() {
return food != maxFood;
}
auto eat(int newFood) -> intersect<void,
condition(newFood + food < maxFood),
condition(isHungry())
>
{
food += newFood;
}
};
...
Dog dog;
dog.food = 2;
int newFood;
std::cin >> newFood;
dog.eat(newFood); // error: argument newFood doesn't satisfy
//`newFood + food < maxFood` and `isHungry`
dog.eat(clamp(0, newFood, 5)); // no error
type Dog = { maxFood :: Int, food :: Int & refine (<=maxFood) }
isHungry dog = food dog != maxFood
eat :: { this :: Dog }
-> { newFood :: Int }
-> Dog & if isHungry & if food this + newFood < maxFood this
= ...
...
main = do
foodToEatInput <- getLine
let foodToEat = read foodToEatInput :: Int
let dog = Dog { food = 2, maxFood = 10 }
print $ eat dog footToEat -- error: argument newFood doesn't satisfy:
-- - food this + newFodd < maxFood this
-- - isHungry
let foodToEatLimited = clamp (0, 5) newFood
print $ eat dog footToEat -- no error
Laws
It can be implementation-dependent whether the value is actually stored in memory, or is it inferred at compile time like an evidence and discarded immediately. Perhaps things like implication and abstract value could be cleaner than a typed variable.
Dog = has
...
food < maxFood => isHungry
...
dog: Dog
dog.food <- read (0 <= _ <= 10)
dog.eat 1 // error: couldn't satisfy `isHungry` (`food <= maxFood`)
class Dog {
...
implication(food < maxFood, isHungry);
...
};
Dog dog;
std::cin >> dog.food;
dog.eat(1); // error: couldn't satisfy `isHungry` (`food <= maxFood`)
type Dog = { maxFood :: Int, food :: Int & refine (<=maxFood) } isHungry dog = food < maxFood eat :: { isHungry dog } => { dog :: Dog } -> Dog
main = do
foodInput <- getLine
let dog = Dog { food = (read foodInput :: Int), maxFood = 10 }
print $ eat dog 1 – error: couldn’t satisfy isHungry
(food <= maxFood
)
There is enough information for isHungry
flag would be completely erased on runtime, even though no values are known at compile time: all checks are performed outside on input, conversions, and manual tests!
By the way, this actually resembles logic programming on Prolog and similar systems.
Complex Relations Expressed With Type System
Static typing in popular languages is very good for letting you know about typos, misplaced arguments, and other minor mess-ups. But it is usually too concrete to give a programmer the ability to express higher concepts and rule out possibilities of behavior-related bugs. While proposed ideas obviously can’t solve all of them, it’s certainly useful to let users describe more details of the system using the tools in the language.
Case Study: Game UI Bugs
Let’s suppose that in some game, the player is suggested to select one or more goods at the same time and click upgrade button to buy selected goods for his gold. There are many kinds of bugs for UI to protect from, such as
- buying 0 goods
- spending all gold in some other window and then buying something in this window because it wasn’t updated
- somehow buying things twice at the same time
- buying things for less gold than required
These errors need to be explicitly warned in UI - that means that every bug has to be mentally (and sometimes physically) procesed multiple times. It would be good to remove this burden from an UI implementor. He would be warned of each bug and he will be forced to handle each case that is described right in the type system.
GoodInShop = has item: InventoryItem
cost: Cost
Shop = has
...
player: Player
buy goods = ...
where goods: [GoodInShop]
this.player.gold >= fold (+) (x:goods -> x.cost)
not goods.isEmpty
struct GoodInShop {
InventoryItem item;
Cost cost;
};
class Shop {
...
using GoodsToBuy = intersect(
const std::vector<GoodInShop>&, // wrong
refine([](const std::vector<GoodInShop>& goods){ return !goods.empty(); }),
refine([this](const std::vector<GoodInShop>& goods){
return player->gold() >=
std::ranges::accumulate(goods, 0 [](const auto& x, const auto&y){
return x.cost + y.cost;}
);
})
);
void buy(GoodsToBuy goods);
};
data GoodInShop = GoodInShop { inventoryItem :: InventoryItem, cost :: Cost }
type BuyOrder = (Player, [GoodInShop])
& refine ((!=[]) . snd)
& refine (\(player, goods) -> foldl (+) 0 x > gold player)
buy :: PlayerId -> GoodsToBuy -> Shop ()
How a game designer could try to use it at first in UI layer:
ShopWindowWidget.buyButton.onClick = do
goodToBuy: this.goodsBox.children
GoodInShop
goodToBuy.Selected = true
this.shop.buy {goodToBuy} // error:
...
this->buyButton->onClick([this](){
std::vector<GoodInShop> goodsToBuy;
for (auto& child: this->goodsBox) {
if (auto goodToBuy = Cast<GoodInShop>(child)) {
if (goodToBuy.Selected)
goodsToBuy.push_back(goodToBuy.good)
}
}
this->shop->buy(goodsToBuy); // error:
});
...
...
button <- makeButton "BuyButton" $ do
goodsToBuy <- getSelectedGoods
buy shop goodsToBuy // error:
...
getSelectedGoods :: GUI [GoodInShop]
getSelectedGoods = do
...
Type system immediately will stop him, noting that goodsToBuy
can be empty and that he didn’t handle the case when there is not enough gold. User can then probably handle these cases like that
ShopWindowWidget.buyButton.onClick = do
goodToBuy: this.goodsBox.children
GoodInShop
goodToBuy.Selected = true
if isEmpty goodToBuy
this.displayNoGoodsSelected()
else if this.player.gold < fold (+) (x:goods -> x.cost)
this.displayNotEnoughGold()
else
this.shop.buy {goodToBuy}
...
this->buyButton->onClick([this](){
std::vector<GoodInShop> goodsToBuy;
for (auto& child: this->goodsBox) {
if auto (auto goodToBuy = Cast<GoodInShop>(child)) {
if auto (goodToBuy.Selected)
goodsToBuy.push_back(goodToBuy.good)
}
}
if auto (goodsToBuy.empty())
displayNoGoodsSelected();
else if auto (player->gold() >= std::ranges::accumulate(goods, Goods::cost))
displayNotEnoughGold();
else
this->shop->buy(goodsToBuy);
});
...
...
button <- makeButton "BuyButton" $ do
goodsToBuy <- getSelectedGoods
buy shop goodsToBuy
...
getSelectedGoods :: GUI [GoodInShop]
getSelectedGoods = do
...
Now, type system quickly ensures that the player will never have less gold than needed on calling buy
, and it won’t allow passing an empty array or any other incorrect data. It would make sense to name give these types/rules simple names and reuse them across the project.
Asking The Type System
No one would want to rewrite parts of conditions like in the previous example. You would either alias each of intersection operands and test separately, or check for generic error. Maybe there could be a special tool that would return true if an argument can be applied by function, kind of like SFINAE in C++:
if fits glDrawElementsInstanced mode count _ indexBuffer _
...
else
fail "incorrect buffer size specified"
if constexpr (fits<glDrawElementsInstanced, mode, count, _, indexBuffer, _>())
...
else
return std::unexpected("incorrect buffer size specified");
if fits glDrawElementsInstanced mode count _ indexBuffer _ then
...
else
fail "incorrect buffer size specified"
Perhaps we could wrap it into a language’s error management systems/patterns somehow (I suggest to read all examples
try
print (read MyStructure)
catch error: TypeError
print e.message
try constexpr { // haha
...
} catch (std::type_error e) {
...
}
parseUser :: String -> String -> TypeCheck User
parseUser name description =
parsedName <- parseName name
parsedDesc <- parseDesc desc
return $ User name desc
...
checkType (parseUser "Jane Doe" "123")"
Synergizing Types
To provide easiest experience when mixing runtime and compile-time computations, we could not only erase the boundary between type-level entities and value-level entities, but also erase the boundary between types and collections.
In the example below, types/functions could have other operations bound to them, allowing them to be used in listing all values from the time, like in collection. Certain things, like Integers could have these operations defined so that their behaviour is obvious.
for i: (>0) // error: no end condition
print i
for i: (<10) // error: no start condition
print i
for 0 < i < 10
print i // 12345678910
for (auto i: moreThan(0)) // error: no end condition (or `begin`)
std::cout << i;
for (auto i: lessThan(10)) // error: no start condition (or `end`)
std::cout << i;
for (auto i: values<intersect<lessThan(0), moreThan(10)>>())
std::cout << i; // 12345678910
map print (>0) -- endless print
map print ((>0) & (<10)) -- 12345678910
So, “more-than” type would provide a way to get the beginning of the collection, “less-than” specifies an ending, and integer would tell the compiler to increment value to get the next value. All three combined allow you to go through each value.
Case Study: Calculated Values as Evidence
Let’s review something less “arithmetic” in the context of this idea.
Creature = has
...
NearEnemy c: Creature = c.faction != this.faction
distance c.position this.position < this.attackRange
attack target: this.NearEnemy =
target.applyDamage this.getDamage
attackClosestEnemy() = do
nearEnemies = world.creatures
this.NearEnemy
attack (world.random.pick nearEnemies)
class Creature {
...
auto NearEnemy<const Creature& c> =
intersect<condition(c.faction != faction),
condition(distance(c.position, position) < attackRange)>;
void attack(const NearEnemy& target) {
target.applyDamage(getDamage());
}
void attackClosestEnemy() {
auto nearEnemies = values(intersect<world->creatures,
NearEnemy>);
attack(world.random.pick(nearEnemies));
}
};
nearEnemy :: Creature -> Creature -> Boolean
nearEnemy c other = faction c != faction other
&& distance (position c) (position other) < attackRange c
attack :: Creature -> NearEnemy -> GameState ()
attack this enemy = applyDamage enemy (damage this)
attackClosestEnemy :: Creature -> GameState ()
attackClosestEnemy this =
creatures <- getCreatures
let nearEnemies = creatures & nearEnemy
randomCreature <- randomPick nearEnemies
attack this randomCreature
All the work happens in nearEnemies
- it goes through all creatures in the world and filters only those from different faction and in near attack range. attack
then takes a value that is definitely a NearEnemy. I think this could be a great improvement in readability, coding speed, and safety.
Conclusion
Of course, by “simpler types” I mean that they shouldn’t be simple in terms of theory behind them. Supporting things above with at compile type and with type erasure could take years of research and implementation. But it will be simpler from user perspective: there should less rules and limitations of what a user can write. For example, by C++17 standard, you are not allowed to capture structured bindings into lambdas. This is a pointless limitation that was fixed in the next standard. Not allowing runtime values in types is similar too. Whatever you write, it should work as expected, like a blackboard.