Understanding Neural Network, Visually (visualrambling.space)

261 points by surprisetalk 4 days ago | 35 comments

helloplanets 14 hours ago [-]

For the visual learners, here's a classic intro to how LLMs work: https://bbycroft.net/llm

tpdly 14 hours ago [-]

Lovely visualization. I like the very concrete depiction of middle layers "recognizing features", which makes the whole machine feel more plausible. I'm also a fan of visualizing things, but I think it's important to appreciate that some things (like a 10,000-dimension vector as the input, or even a 100-dimension vector as an output) can't be concretely visualized, and you have to develop intuitions in more roundabout ways.

I hope you make more of these, I'd love to see a transformer presented more clearly.

esafak 15 hours ago [-]

This is just scratching the surface -- where neural networks were thirty years ago: https://en.wikipedia.org/wiki/MNIST_database

If you want to understand neural networks, keep going.

abrookewood 7 hours ago [-]

Which, if you are trying to learn the basics, is actually a great place to start ...

droidist2 55 minutes ago [-]

Really cool. The animations within a frame work well.

brudgers 2 days ago [-]

The original Show HN, https://news.ycombinator.com/item?id=44633725

vivzkestrel 4 hours ago [-]

- While impressive, it still doesn't tell me why a neural network is architected the way it is, and that, my bois, is where this guy comes in: https://threads.championswimmer.in/p/why-are-neural-networks...

- make a visualization of the article above and it would be the biggest aha moment in tech

swframe2 8 hours ago [-]

This Welch Labs video is very helpful: https://www.youtube.com/watch?v=qx7hirqgfuU

chan1 7 hours ago [-]

Super cool visualization. Found this vid by 3Blue1Brown super helpful for visualizing transformers as well: https://www.youtube.com/watch?v=wjZofJX0v4M&t=1198s

bilbo-b-baggins 1 hours ago [-]

Their series on LLMs, neural nets, etc., is amazing.

vicentwu 2 hours ago [-]

I like the CRT-like filter effect.

ge96 14 hours ago [-]

I like the style of the site; it has a "vintage" look

Don't think it's moire effect but yeah looking at the pattern

Bengalilol 11 hours ago [-]

Lucky you!

<https://visualrambling.space/dithering-part-1/>

<https://visualrambling.space/dithering-part-2/>

ge96 11 hours ago [-]

Oh god my eyes! As it zooms in (ha)

That's cool, rendering shades in the old days

Man those graphics are so good damn

8cvor6j844qw_d6 12 hours ago [-]

Oh wow, this looks like a 3D render of a perceptron from when I started reading about neural networks. I guess essentially neural networks are built based on that idea? Inputs > weight function to adjust the final output to desired values?

mr_toad 7 hours ago [-]

The layers themselves are basically perceptrons, not really any different to a generalized linear model.

The ‘secret sauce’ in a deep network is the hidden layer with a non-linear activation function. Without that you could simplify all the layers to a linear model.
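
A minimal sketch of that collapse, assuming a toy two-layer network with made-up weights (the matrices and helper names here are hypothetical, not taken from the article). Without an activation, stacking two layers gives the same output as one layer whose matrix is the product of the two; putting a ReLU in between breaks that equivalence.

```python
# Toy illustration with made-up weights: two linear layers with no
# activation in between behave like a single linear layer.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def relu(v):
    """Element-wise ReLU, one common non-linear activation."""
    return [max(0.0, vi) for vi in v]

W1 = [[1.0, -2.0], [0.5, 1.0]]  # hidden layer weights (hypothetical)
W2 = [[1.0, 1.0]]               # output layer weights (hypothetical)
x = [1.0, 1.0]

# No activation: W2 (W1 x) equals (W2 W1) x, so the layers collapse.
stacked = matvec(W2, matvec(W1, x))
collapsed = matvec(matmul(W2, W1), x)

# ReLU between the layers: the single collapsed matrix no longer matches.
with_relu = matvec(W2, relu(matvec(W1, x)))

print(stacked, collapsed, with_relu)
```

For this input the hidden layer produces a negative value that ReLU zeroes out, which is exactly the information a single matrix cannot reproduce.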

sva_ 8 hours ago [-]

A neural network is basically a multilayer perceptron

https://en.wikipedia.org/wiki/Multilayer_perceptron

adammarples 10 hours ago [-]

Yes, vanilla neural networks are just lots of perceptrons

jazzpush2 12 hours ago [-]

I love this visual article as well:

https://mlu-explain.github.io/neural-networks/

4fterd4rk 16 hours ago [-]

Great explanation, but the last question is quite simple. You determine the weights via brute force. Simply running a large amount of data where you have the input as well as the correct output (handwriting to text in this case).

ggambetta 15 hours ago [-]

"Brute force" would be trying random weights and keeping the best performing model. Backpropagation is compute-intensive but I wouldn't call it "brute force".

Ygg2 15 hours ago [-]

"Brute force" here is about the amount of data you're ingesting. It's no Alpha Zero, that will learn from scratch.

jazzpush2 12 hours ago [-]

What? Either option requires sufficient data. Brute force implies iterating over all combinations until you find the best weights. Back-prop is an optimization technique.
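
To make the distinction concrete, here is a rough sketch on a one-weight model y = w * x, with made-up data. The function names, learning rate, and search range are all assumptions for illustration. `random_search` is "brute force" in the guess-and-keep-the-best sense; `gradient_descent` follows the slope of the loss, which is what backprop lets you compute efficiently in a deep network.

```python
import random

# Toy data for y = 3 * x (made up for illustration).
data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0)]

def loss(w):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def random_search(trials, seed=0):
    """Guess weights at random and keep whichever has the lowest loss."""
    rng = random.Random(seed)
    best_w = rng.uniform(-10, 10)
    for _ in range(trials - 1):
        w = rng.uniform(-10, 10)
        if loss(w) < loss(best_w):
            best_w = w
    return best_w

def gradient_descent(steps, lr=0.05, w=0.0):
    """Repeatedly step against the gradient of the loss with respect to w."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

print("random search:", random_search(100))
print("gradient descent:", gradient_descent(50))
```

Both see the same data; the difference is how the next candidate weight is chosen. With one weight, guessing is workable, but the number of guesses needed grows very quickly with the number of weights.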

Ygg2 3 hours ago [-]

In the context of the grandparent's post:

     > You determine the weights via brute force. Simply running a large amount of data where you have the input as well as the correct output

Brute force just means guessing all possible combinations. A dataset containing most human knowledge is about as brute force as you can get.

I'm fairly sure that Alpha Zero data is generated by Alpha Zero. But it's not an LLM.

jetfire_1711 11 hours ago [-]

Spent 10 minutes on the site and I think this is where I'll start my day from next week! I just love visual based learning.

cwt137 14 hours ago [-]

This visualization reminds me of the 3blue1brown videos.

giancarlostoro 14 hours ago [-]

I was thinking the same thing. It's at least the same description.

shrekmas 9 hours ago [-]

As someone who does not use Twitter, I suggest adding RSS to your site.

atultw 5 hours ago [-]

Nice work

artemonster 12 hours ago [-]

I get 3fps on my chrome, most likely due to disabled HW acceleration

nerdsniper 12 hours ago [-]

High FPS on Safari M2 MBP.

anon291 13 hours ago [-]

Nice visuals, but misses the mark. Neural networks transform vector spaces, and collect points into bins. This visualization shows the structure of the computation. This is akin to displaying a matrix-vector multiplication in Wx + b notation, except W, x, and b have more exciting displays.

It completely misses the mark on what it means to 'weight' (linearly transform), bias (affine transform) and then non-linearly transform (i.e., 'collect') points into bins.
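
One way to read those three steps, as a small sketch with made-up numbers (the `layer` helper, the identity weights, and the choice of ReLU are assumptions, not anything from the article): the weights apply a linear map, the bias shifts the result, and the non-linearity folds whole regions of the input space onto the same output.

```python
def layer(W, b, x):
    """One dense layer: linear map W x, shift by b, then ReLU."""
    linear = [sum(w * xi for w, xi in zip(row, x)) for row in W]  # 'weight'
    affine = [l + bi for l, bi in zip(linear, b)]                 # 'bias'
    return [max(0.0, a) for a in affine]                          # non-linearity

W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, chosen to keep the shift visible
b = [-1.0, -1.0]              # moves the ReLU threshold to 1.0 on each axis

points = [[0.5, 0.5], [2.0, 0.5], [2.0, 3.0]]
outputs = [layer(W, b, p) for p in points]

for p, out in zip(points, outputs):
    print(p, "->", out)
```

Every point with both coordinates below 1.0 lands on the same output, `[0.0, 0.0]`, which is one loose sense in which the layer "collects" points into a bin.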

titzer 12 hours ago [-]

> but misses the mark

It doesn't match the pictures in your head, but it nevertheless does present a mental representation the author (and presumably some readers) find useful.

Instead of nitpicking, perhaps pointing to a better visualization (like maybe this video: https://www.youtube.com/watch?v=ChfEO8l-fas) could help others learn. Otherwise it's just frustrating to read comments like this.

pks016 13 hours ago [-]

Great visualization!

javaskrrt 14 hours ago [-]

very cool stuff
