
I used to do a lot of machine learning code in go and think it has great potential as a compiled, static language with similar ease of development to python.

However, it is hard to get around the lack of operator overloading and (to a lesser extent, at least to me) generics. I love the simplicity of the language and understand their feeling that operator overloading is too often abused, but at the same time, not being able to use algebraic operators makes matrix and tensor libraries really hard to use.

The compacting garbage collector can also make it hard to pass pointers to memory to non-Go libraries, which is key in data science.

If this project could address those things, I think it could have real potential.



> With similar ease of development to Python

Isn't the goal of general typed languages like Go or Rust to run, not build, the scripts of data science software, for example? I wouldn't compare Python and Go; it's a different use case to me.

While Go looks to be in the middle, Rust is at the opposite end from Python, and it must be a good choice for building data software that runs data scripts.

> The [Go] lack of operator overloading => https://doc.rust-lang.org/rust-by-example/trait/ops.html

> The [Go] lack of generics => https://doc.rust-lang.org/book/ch10-01-syntax.html

> not being able to use algebraic operators for matrix and tensor libraries => https://tensorflow.github.io/rust/tensorflow/struct.Tensor.h...


One of the original intents of go was to make a static, compiled language that felt familiar to python/ruby programmers. This manifests as a really concise syntax (type inference via := etc) and a tight development loop enabled by fast compilation times (enabled by being strict about unused dependencies etc).

I was for a time optimistic you could use it as your scripting language without much downside and get all the upside of compiled static types. Rust looks cool and I want to do a project in it at some point but at the moment I'm most optimistic about python with optional type annotations that are understood by compilers and alternative runtimes.


At Google, Go is mostly used for stuff that they would have used Python for in the past. Idk about the rest of the world.


Currently working as a backend dev in a mid-sized company. Current directive is a gradual migration to Go for backend services that used to be written in Python/Django.


Why?


Go is a more restrictive language, which makes it slightly harder to create horrible codebases. It's also faster and a bit cheaper to deploy.


> which makes it slightly harder to create horrible codebases

Going to have to strongly disagree. It forces you to make horrible codebases with endless boilerplate code and increased complexity introduced by workarounds for abstractions you can suddenly no longer make due to questionable language limitations. You will get improved performance, however.


I've seen people complain about that, but I've been using golang for over two years, and I haven't really had to face that pain, yet. I used python for twenty years prior to that, and love sophisticated programming constructions (did a lot of work with clojure, learnt haskell, went through On Lisp), so it's not as if I don't know what I'm missing.


Any abstraction possible in Python can be expressed in Go just via the interface{} type, as the type of everything in Python is just interface{}.


No, that's not true at all. Just try to create an OrderedMap that supports the same abstract interface as Go's built-in map type, or try to implement a decimal floating point type that supports the same operators as the built-in binary floating point type. It's not possible.
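For contrast, here is a minimal Python sketch of the two cases named above. `collections.OrderedDict` and `decimal.Decimal` are standard-library types rather than language built-ins, yet they accept the same indexing and arithmetic syntax as `dict` and `float`. The variable names are only illustrative.

```python
from collections import OrderedDict
from decimal import Decimal

# A library-defined mapping used with the same syntax as the built-in dict.
prices = OrderedDict()
prices["tea"] = Decimal("0.10")
prices["milk"] = Decimal("0.20")
same_interface = "tea" in prices and len(prices) == 2

# A library-defined decimal type used with the same operators as float.
total = prices["tea"] + prices["milk"]
decimal_exact = (total == Decimal("0.30"))  # decimal arithmetic is exact here
float_exact = (0.10 + 0.20 == 0.30)         # binary floating point is not

print(same_interface, decimal_exact, float_exact)  # True True False
```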


Whether something can technically be done and whether it is good/easy/simple/etc. are totally different conversations. I'm pretty sure you can't implement a min function that works on both strings and ints in Go by using the interface{} abstraction.


Interesting, I wouldn’t have thought of Go for ML. But I do share the enjoyment of static languages for ML/data science. You might give Nim a look as it’s pretty practical for wrapping C++ code!


Ref operator overloading. As someone not used to python who had to read a simple numpy script last week, I was stumped for a while on this line of code: X[y==1,0] Just that. I first thought, what would X[False,0] be? Since y was a vector, it obviously wouldn't be equal to one. Okay, but extracting that part, it looks like y==1 takes my vector and replaces it with an array of the same size, with true or false for each element. Basically == is overridden to run a predicate over all elements. Okay, but then what does X[[True,False,False..],0] mean? Looks like numpy has overridden the [] so one can pass an array of booleans in addition to a normal index, and then it only keeps those elements corresponding to True indexes.

Clever and useful when done daily I guess, but damn it was hard to understand those 9 characters as someone not well-versed in this domain.
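The two overloads described above can be sketched in plain Python. This is a toy illustration, not NumPy's implementation: the `Vector` and `Matrix` classes are hypothetical and support only what the `X[y==1,0]` expression needs.

```python
class Vector:
    """Toy 1-D array whose `==` compares element by element."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, scalar):
        # Return one boolean per element instead of a single bool.
        return [v == scalar for v in self.values]


class Matrix:
    """Toy 2-D array whose `[]` accepts (boolean mask, column index)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __getitem__(self, key):
        mask, col = key  # X[mask, col] arrives as a tuple
        # Keep only the rows whose mask entry is True, then take one column.
        return [row[col] for row, keep in zip(self.rows, mask) if keep]


X = Matrix([[10, 11], [20, 21], [30, 31]])
y = Vector([1, 0, 1])

mask = (y == 1)          # [True, False, True]
selected = X[y == 1, 0]  # column 0 of the rows where y is 1
print(mask, selected)    # [True, False, True] [10, 30]
```

So the nine characters are two separate overloads chained together: `==` builds the mask, and `[]` consumes it.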


I never understand why operator overloading is said to make things more readable.

If the meaning of an operator can change wildly with the operands then that's just confusing - you can't assume that '==' means what you think it means and you have to go find out what it means.

In comparison, having an actual function name to clue me in on what something does is useful. Like, how is "X[y==1,0]" more readable in this case than something like "filterElements(arrayToFilter, arrayOfBools)"? (if I've understood what the original was trying to do, which I'm not sure I have).

People seem to confuse "less typing" with "simpler", and that's not true. One of the great strengths of Go is that it rejects this and embraces true simplicity.


> I never understand why operator overloading is said to make things more readable.

Because, used properly, it does.

> If the meaning of an operator can change wildly with the operands then that's just confusing

Yes, irresponsible use of operator overloading makes things confusing.

Overloading enables preserving existing semantics with new types that have similar semantic roles; it also enables natural, concise, domain specific notation which may sometimes have different semantics than the standard use (while wild, unpredictable semantic swings hurt readability, humans are naturally quite good at incorporating context into interpretation of symbols/language, and avoiding context sensitivity for naive simplicity does not aid readability).

Verbosity can be quite bad for the ability to quickly grasp the meaning of things.

> People seem to confuse "less typing" with "simpler"

Conciseness (not mere terseness, but clarity and terseness together) greatly aids readability. Verbosity is not zero-cost.


> Conciseness (not mere terseness, but clarity and terseness together) greatly aids readability. Verbosity is not zero-cost.

I've been coding for 40-ish years. I've never found this to be true. Simple expressions are (in my experience) more readable.

I understand it like this: to understand a complex expression you have to unpack it in your head to a simpler version in order to grok it. This is an operation you don't need to do if the expression is in the simpler, more verbose, version in the first place.

This is a known thing in writing, btw - complex sentences are harder to read. If you want your audience to understand you, write more, simpler, sentences.


> I've been coding for 40-ish years.

Good for you, I've only been coding for 38 years.

> Simple expressions are (in my experience) more readable.

Simple is not the inverse of concise; there may be times when simpler expressions are more verbose, but that's not even approximately generally the case. “x²+1” and “x**2+1” and “add(pow(x,2),1)” and “x raised to the second power plus one” are equally simple (or, at least, the later ones are not more simple), but they are progressively less concise.

(It's true that expanding the space of concise expressions may require more complex notation, and when the notation is unfamiliar, that creates a learning curve for learning the notation, but there's a reason people familiar with domains develop notations that support more concise expressions.)

> I understand it like this: to understand a complex expression you have to unpack it in your head to a simpler version in order to grok it.

That's true of complexity of expressions, but again that's not the issue here. And concise notation expands the kind of expressions that can be grokked by pattern recognition rather than unpacking.


I think for terser expressions to be more readable, the reader has to be more context-aware and generally more immersed in the paradigm. There's an understanding of the language that needs to be acquired.

Less terse language relies less on shared context, and thus is easier on newbies. There is less assumed knowledge, more things made explicit.

> And concise notation expands the kind of expressions that can be grokked by pattern recognition rather than unpacking.

I have this totally the other way. After years of coding in Go, I can parse "if err != nil" subconsciously and only ever deal with it if it's not that (e.g. if err == nil). It's not concise, but it is very, very easy to read.


I can comfortably display maybe sixty lines on my screen. “if err != nil” wastes three of them, every time I do anything. I don’t want to explicitly bail out on an error for the same reason I don’t want to explicitly set up a stack frame or interpolate values into a string. I only want to deal with how this program is different than other programs, not the mechanics of how f(g(x), h(y)) is orchestrated.

Any worthwhile tool is going to be used for years, and you’re only going to be newbie for a small fraction of the time. It’s better to invest time learning a good notation than to force all the expensive experts to slog through a bad notation forever.


dude, scrolling the page is literally a finger on the mouse wheel. I don't think "I need to see my entire program in one 60-line screen" is a good dynamic for coding.

Explicitly handling errors is one of those things that you get used to, for really, really, good reasons, when learning Go.

> Any worthwhile tool is going to be used for years, and you’re only going to be newbie for a small fraction of the time. It’s better to invest time learning a good notation than to force all the expensive experts to slog through a bad notation forever.

No, because assuming the next developer knows as much as you is probably wrong. Because reading code you wrote 6 months ago is like reading an alien script. And because Go (for very, very good reasons) optimises readability over terseness.


Long expressions with matrix operations are a pretty standard example. When people talk about operator overloading in data science, they usually mean “standard operations on various arrays of numbers,” which are defined in common libraries or the programming language. Not “I need to define my own ad hoc equalities.”


yeah, I get this. If there are standard definitions of operations that everyone understands, that's fine.

But I always think that maybe we should be using new operators for this, instead of overloading existing ones that have other, different, meanings in different contexts.


In a data science context, the key operations are math, so overloading makes a lot of sense and is massively helpful in implementing algorithms and equations. I go back and forth on the wisdom of some of the other common uses — filtering, etc. In addition to the problems that have been mentioned, there are often hidden and infrequent but painful performance issues.


I often think that maths could use the same slap around the chops. Less arcane operators and symbols, more explicit function names please!


I consider it kind of important that the notation for expressions like “A²” doesn’t depend on whether A is an integer, real number, complex number, matrix, random variable, etc., (even if the results do) or what the specific domain is, but if you feel like it’s important to embed all of that context in the exponent operator... give it a try :)

(And whether “2” is integer, real, rational, complex, etc)


Yeah, but operator overloading doesn't say any of that. You have no idea what the "^" operator does, depending on the operands


Ehhh, it would seem that way, but the compactness of the syntax functions to get out of the way and help you understand the overall structure. Having longer function names ends up getting in your way more often than not in my experience.


It's not so much a matter of reduced typing as that if you're invoking an operation many times, developing a concise notation for it can cut down on the noise it creates for a reader. It should be used very sparingly and heavily documented, though, for exactly the reason you outline.

It really comes down to who you're writing the code for. For something like numpy, whose users will mostly be familiar with matrix notations, operator overloading enables a huge improvement.


> I never understand why operator overloading is said to make things more readable.

OCaml doesn't overload even the arithmetic operators, so you write for integers

    1 + t * A

and for floating point

    1 +. t *. A

and for matrices you would make something like

    scal_mat_add(1, scal_mat_mul(t, A))

Do you really prefer these three, over writing

    1 + t * A

for all cases?
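As a rough sketch of the overloaded alternative, here is the single expression working for all three cases in Python. The `Matrix` class is a hypothetical toy, and it assumes that adding a scalar to a matrix means adding it to every element (the broadcasting convention), which is one possible reading of `1 + t * A`.

```python
class Matrix:
    """Toy matrix supporting only scalar * matrix and scalar + matrix."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def _map(self, f):
        # Apply f to every element and wrap the result in a new Matrix.
        return Matrix([[f(v) for v in row] for row in self.rows])

    def __rmul__(self, scalar):  # handles scalar * Matrix
        return self._map(lambda v: scalar * v)

    def __radd__(self, scalar):  # handles scalar + Matrix, element-wise
        return self._map(lambda v: scalar + v)


def f(t, A):
    # The same source text for integers, floats, and matrices.
    return 1 + t * A


int_result = f(2, 3)
float_result = f(0.5, 4.0)
matrix_result = f(2, Matrix([[1, 2], [3, 4]])).rows

print(int_result)     # 7
print(float_result)   # 3.0
print(matrix_result)  # [[3, 5], [7, 9]]
```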


> how is "X[y==1,0]" more readable in this case than something like "filterElements(arrayToFilter, arrayOfBools)"?

Just the same way that a[i] *= b[j] is more readable than a.IndexElement(firstIndex).MultiplyByFloat(b.IndexElement(secondIndex))


yeah, sorry, but I didn't understand the first one at all, and totally understood the second one. Your definition of "readable" and mine differ ;)


I don't follow. You have been coding for 40 years and you do not understand a[i] *= b[j] at all but you understood the expression I made up?


Bah, it's the same as Maths: notations 'compress' the formulas but at the cost of having to learn these notations..


yes, this. Completely.

Is that more readable or less?


For who? Beginners or experts?


Experts. Optimising your whole notation for beginners is pre-emptively putting up a skill ceiling. Beginners stop being beginners (at which point they’ll outgrow the beginner oriented syntax) but experts will remain experts.

Instead, optimise for teaching/learning the skills better rather than capping everyone’s skills. The presence of a learning curve is not an inherently bad thing.

Edit: re-reading your previous comments, I think you and I are in furious agreement haha


That example is using “logical vectors”, which you’d come across in more data-science languages like Matlab, Octave, R, etc. Julia[1] has a more modern take on y==1, by having explicit syntax for element-wise operations, so it uses y.==1 instead.

What I’m really saying is that there’s quite a bit of precedent for that syntax, but it comes from a more specialised field so it is easy to have not come across it before.

[1] https://docs.julialang.org/en/v1/manual/functions/#man-vecto...


MATLAB introduced automatic broadcasting of operators over n-dimensional arrays and logical indexing nearly 40 years ago, and it is still the primary learning language for applied mathematicians, engineers, and scientists, and also a popular prototyping language for numerical algorithm developers. And it provides a great interactive REPL with built-in plotting for exploratory data analysis.

Since then, the idea and basic syntax have been adopted by GNU Octave, S, R, and now NumPy and Matplotlib, which did it to make it easier for statisticians, engineers, and scientists to adopt Python. Specifically targeting these groups with familiar syntax is exactly why Python is so popular for data science, because data scientists tend to be recruited from the hard engineering and science disciplines. It's a lot easier to teach basic programming to someone with a great background in applied math, experimental design, and research methods, than it is to teach all those things to programmers.

This is an area in which languages with operator overloading shine, creating DSLs that mimic the syntax and semantics of other languages. You might have a lot to learn because you're used to == only being defined for scalar data types and arrays only being indexed by natural numbers, but the people the language is designed for are used to broadcasted operators and logical array indexing.


I find this is common in python: there are nice shorthand things you can do that are definitely powerful, but they are not easy to understand nor to remember. Particularly with conditions applied to arrays / series this is a problem. "Truth of a series is ambiguous" is one of my most frequent errors.

That said, the overall ecosystem still makes python the most practical general data science language in my view.


The lack of operator overloading and the Go->C FFI are pretty big hindrances.

Go just wasn’t designed for this kind of work. Which is unfortunate because it brings a lot of great things to the table.

Vlang is probably the closest spiritual successor that would work, or someone just needs to write a new language



