Gluon: a GPU programming language based on the same compiler stack as Triton (github.com)

83 points by matt_d 143 days ago | 24 comments

lukax 143 days ago [-]

Is this Triton's reply to NVIDIA's tilus[1]? Tilus is supposed to be lower level (e.g. you have control over registers). NVIDIA really does not want the CUDA ecosystem to move to Triton, as Triton also supports AMD and other accelerators. So with Gluon you get access to lower-level features and you can stay within the Triton ecosystem.

[1] https://github.com/NVIDIA/tilus

jillesvangurp 142 days ago [-]

There's a lot of pressure on the CUDA ecosystem at this point:

- Most of the trillion-dollar companies have their own chips with AI features (Apple, Google, MS, Amazon, etc.). GPUs and AI training are among their biggest incentives. They are super motivated to not donate major chunks of their revenue to Nvidia.

- Mac users don't generally use Nvidia anymore with their Mac hardware, and Apple's CPUs are a popular platform for doing stuff with AI.

- AMD, Intel and other manufacturers want in on the action

- The Chinese and others are facing export restrictions for Nvidia's GPUs.

- Platforms like mojo (a natively compiled python with some additional language features for AI) and others are getting traction.

- A lot of the popular AI libraries support things other than Nvidia at this point.

This just adds to that. Nvidia might have to open up CUDA to stay relevant. They do have a performance advantage. But forcing people to choose inevitably leads to plenty of choice being available to users. And the more users choose differently, the less relevant CUDA becomes.

varelse 135 days ago [-]

[dead]

reasonableklout 143 days ago [-]

It sounds like they share that goal. Gluon is a thing because the Triton team realized over the last few months that Blackwell is a significant departure from Hopper, and achieving >80% SoL (speed-of-light) kernels is becoming intractable as the Triton middle-end simply can't keep up.

Some more info in this issue: https://github.com/triton-lang/triton/issues/7392

mdaniel 143 days ago [-]

Also it REALLY jams me up that this is a thing, complicating discussions: https://github.com/triton-inference-server/server

robertlagrant 142 days ago [-]

Oh! I thought it was that, having jumped straight to comments before article.

bobmarleybiceps 142 days ago [-]

it feels like Nvidia has 30 "tile-based DSLs with python-like syntax for ML kernels" that are in the works lol. I think they are very worried about open source and portable alternatives to cuda.

WithinReason 142 days ago [-]

Not at all, they are the ones pushing for vendor agnostic Tensorcore extensions in Vulkan, which would solve some part of the portability issue: https://github.com/jeffbolznv/vk_cooperative_matrix_perf

saagarjha 143 days ago [-]

I believe it’s the other way around; Gluon exposes the primitives Triton was built on top of.

YetAnotherNick 142 days ago [-]

No, Gluon was in development before Tilus was announced. Could be a response to Cute DSL[1] though.

[1]: https://docs.nvidia.com/cutlass/media/docs/pythonDSL/cute_ds...

some_guy_nobel 143 days ago [-]

Amazon (+ Microsoft) already released a language for ML called gluon 8 years ago: https://aws.amazon.com/blogs/aws/introducing-gluon-a-new-lib...

autogluon is popular as well: https://github.com/autogluon/autogluon

ronsor 143 days ago [-]

The fact that the "language" is still Python code which has to be traced in some way is a bit off-putting. It feels a bit hacky. I'd rather a separate compiler, honestly.

JonChesterfield 143 days ago [-]

Mojo for Python syntax without the AST-walking decorator, CUDA for C++ syntax over controlling the machine, ad hoc code generators writing MLIR for data-driven parametric approaches. The design space is filling out over time.

pizlonator 143 days ago [-]

The fact that these are all add on syntaxes is strange. I have my ideas about why (like you want to write code that cooperates with host code).

Do any of y’all have clear ideas about why it is that way? Why not have a really great bespoke language?

saagarjha 143 days ago [-]

Hard to beat trifecta of familiar language, same source files and toolchain, JIT compiled

pizlonator 143 days ago [-]

That’s sort of what I assumed, yeah. And I think that makes sense.

But they end up adding super sophisticated concepts to the familiar language. Makes me wonder if the end result is actually better than having a bespoke language.

saagarjha 142 days ago [-]

I mean, you used to be able to write TTGIR directly; this is mostly sugar on top of that.

zer0zzz 143 days ago [-]

This is pretty common among these ML toolchains, and not a big deal. They use Python's ast lib and function decorators to implement an AST walker and code generator. It works quite well.

derbOac 143 days ago [-]

Yeah that struck me as odd. It's more like a Python library or something.

zer0zzz 143 days ago [-]

It’s a dsl not a library. The kernel launch parameters and the ast walk generate ir from the Python.
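
For anyone wondering what "walk the AST and generate IR" looks like, here is a minimal sketch using only Python's standard `ast` module. It is not Triton's or Gluon's actual code: the `ToyLowering` class, the `lower` helper, and the three-address "IR" text it prints are all made up for illustration. A real DSL would typically get the source from the decorated function rather than from a string, and would emit real compiler IR instead of text.

```python
import ast


class ToyLowering(ast.NodeVisitor):
    """Walks a Python AST and emits a made-up three-address IR (hypothetical, not Triton's)."""

    # Only a few arithmetic operators are supported in this sketch.
    OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}

    def __init__(self):
        self.lines = []
        self.counter = 0

    def fresh(self):
        # Allocate a new temporary name such as %1, %2, ...
        self.counter += 1
        return f"%{self.counter}"

    def visit_FunctionDef(self, node):
        args = ", ".join(a.arg for a in node.args.args)
        self.lines.append(f"func @{node.name}({args}) {{")
        for stmt in node.body:
            self.visit(stmt)
        self.lines.append("}")

    def visit_Return(self, node):
        value = self.visit(node.value)
        self.lines.append(f"  return {value}")

    def visit_BinOp(self, node):
        # Lower operands first, then emit one instruction for this node.
        lhs = self.visit(node.left)
        rhs = self.visit(node.right)
        out = self.fresh()
        self.lines.append(f"  {out} = {self.OPS[type(node.op)]} {lhs}, {rhs}")
        return out

    def visit_Name(self, node):
        return node.id

    def visit_Constant(self, node):
        return repr(node.value)

    def generic_visit(self, node):
        # Anything outside the tiny supported subset is rejected.
        raise NotImplementedError(type(node).__name__)


def lower(source):
    """Parse the source of a single function and return the toy IR as text."""
    func = ast.parse(source).body[0]
    walker = ToyLowering()
    walker.visit(func)
    return "\n".join(walker.lines)


KERNEL_SOURCE = "def axpy(a, x, y):\n    return a * x + y\n"
print(lower(KERNEL_SOURCE))
```

The point is that the Python function body is never executed as Python; it is only parsed, and the walker decides what each syntax node means. That is why these tools read as a DSL embedded in Python rather than as an ordinary library.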

xcodevn 143 days ago [-]

Interesting, I can see this being very similar to Nvidia's CuTe DSL. This hints that we are converging to a (locally) optimal design for Python-based DSL kernel programming.

ivolimmen 143 days ago [-]

Not to be confused with the Gluon UI toolkit for Java : https://gluonhq.com/products/javafx/

liuliu 143 days ago [-]

Or the GluonCV by mxnet guys (ancient! https://github.com/dmlc/gluon-cv)

huevosabio 143 days ago [-]

Not to be confused with gluon, the embeddable language in Rust: https://github.com/gluon-lang/gluon

ericdotlee 143 days ago [-]

Why is zog so popular these days? Seems really cool but I have yet to get the buzz / learn it.

Is there a big reason why Triton is considered a "failure"?
