Data Manipulation in Clojure Compared to R and Python (codewithkira.com)
57 points by tosh 2 days ago | 13 comments
ertucetin 2 hours ago
I’ve built many different kinds of software (backend, frontend, 3D games, CLI tools, a code editor, and more) with Clojure and have been using it for over a decade now.

I can confidently say that, among the list I mentioned, it’s the best for data manipulation/transformation. Thanks to the author for presenting it clearly and showing how the libraries and code look across different languages, all of which do a great job.

But Clojure has its own special place (maybe in my heart as well :). I think Clojure should be used more in the data science space. Thanks to the JVM, it can be very performant (I’m looking at you, Python).

olivia-banks 34 minutes ago
Having "NA" treated as nil/null/None by default seems like it would cause the Namibia problem!
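For anyone who hasn't run into it, here is a rough sketch of how that bites. The `read_rows` loader, its `missing` parameter, and the sample data are all made up for illustration; they only mimic a reader that treats the literal string "NA" as missing by default, and are not any particular library's API.

```python
import csv
import io

# Hypothetical default: both the empty string and "NA" count as missing.
DEFAULT_MISSING = frozenset({"", "NA"})


def read_rows(text, missing=DEFAULT_MISSING):
    """Parse CSV text into dicts, mapping any value in `missing` to None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (None if value in missing else value) for key, value in row.items()}
        for row in reader
    ]


# Made-up sample: "NA" is Namibia's ISO 3166-1 alpha-2 country code.
SAMPLE = "country,code\nNamibia,NA\nNorway,NO\n"

# With the default, Namibia's code is silently turned into a missing value.
naive = read_rows(SAMPLE)

# Spelling out the missing markers keeps "NA" as an ordinary string.
safe = read_rows(SAMPLE, missing=frozenset({""}))

print(naive[0])
print(safe[0])
```

The usual workaround is the second call: tell the reader explicitly which strings mean "missing" for columns where "NA" is a legitimate value.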
__mharrison__ 2 hours ago
Good pandas and polars code should also be written in an immutable way...
epgui 2 hours ago
Good Python code can exist, but Python makes it so easy to write bad code that good Python rarely exists.
nxpnsv 2 hours ago
Agree. While it is common to see code like these pandas examples, it is very possible to write these manipulations so that they return a new frame or view without changing the inputs.
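To make the idea concrete without tying it to one library, here is a minimal sketch in plain Python where every step returns new data and leaves its input alone. The helper names (`select_columns`, `filter_rows`) and the penguin-style rows are invented for illustration; this is not pandas or polars code, just the same principle on a list of dicts.

```python
from copy import deepcopy


def select_columns(rows, prefix):
    """Return new rows keeping only the keys that start with `prefix`."""
    return [
        {key: value for key, value in row.items() if key.startswith(prefix)}
        for row in rows
    ]


def filter_rows(rows, predicate):
    """Return copies of the rows for which `predicate` is true."""
    return [dict(row) for row in rows if predicate(row)]


# Made-up rows, loosely shaped like a penguins dataset.
penguins = [
    {"species": "Adelie", "bill_length_mm": 39.1, "bill_depth_mm": 18.7},
    {"species": "Adelie", "bill_length_mm": 28.5, "bill_depth_mm": 17.0},
    {"species": "Gentoo", "bill_length_mm": 46.1, "bill_depth_mm": 13.2},
]

# Snapshot the input so we can check that it is left untouched.
snapshot = deepcopy(penguins)

# Each step builds a new list of new dicts; nothing is modified in place.
bill_columns = select_columns(penguins, "bill")
short_bills = filter_rows(bill_columns, lambda row: row["bill_length_mm"] < 30)

print(short_bills)
print(penguins == snapshot)
```

Because no step mutates its argument, the intermediate results can be reused or inspected in any order, which is the property the chained dataframe style is after.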
soumyaskartha 2 hours ago
Clojure never got the data science crowd even though the language is genuinely good for it. Always felt like a distribution problem more than a technical one.
asa400 2 hours ago
Unfortunately, having to mess around with a JVM is a tough sell for a lot of data analysis folks. I'm not saying it's rational or right, but a lot of people hear "JVM" and they go "no thank you". Personally I think it's a non-issue, but you have to meet people where they are.
packetlost 3 minutes ago
idk, I don't think I've had to do anything beyond install the JVM to work with Clojure. I'm not really a fan of the clj command's flag choices though (-M, -X, etc. all make no sense)
pjmlp 36 minutes ago
The irony, given the mess of Python setup: there are companies whose whole business is solving Python tooling.
cmiles74 9 minutes ago
I dunno, if you can slog through the Python ecosystem then the JVM is starting to look not so bad. Plus with Clojure you don't need to deal with the headache and heartache that is Maven.
famicom0 1 hour ago
Meanwhile, I find it very annoying to deal with the litany of Python versions, the distinction between global packages and user packages, and needing to manage virtual environments just to run scripts. That being said, I am not an expert, but that's always been my experience when I need to do anything Python-related.
levocardia 1 hour ago
In this very post you can see why: the dplyr code is just so much more readable. Like a lot of Python, dplyr reads almost like pseudocode: take this dataset, select the columns that start with "bill", then filter so that bill_length is less than 30. So simple and so little fluff!
erichocean 1 hour ago
> is just so much more readable

I thought that too before I learned Clojure, now I find them equally readable.