[Meta] Are we scientists yet? #77
Placeholder. To avoid polluting this meta-thread with specific discussion on certain topics (say, what I want in the random library), this will link to the discussion topics:
- Multidimensional arrays, Linear-algebra
- Plotting
- Geospatial
- Image processing
- Dataframes, columnar/tabular data processing
- Random
- Statistics
- Machine learning
- Deep learning

No issue open:
- Computational Geometry |
For sampling from other distributions, there is Alea. I have to clean it up - some examples fail with the latest concept changes in devel - but I hope to make these work again soon |
This almost makes me want to buy arewescientistsyet.org à la http://www.arewewebyet.org/. Perhaps you'd be interested in creating something like this? :) |
I would also add in differential equation solvers as well as Markov chain Monte Carlo samplers... |
Over the last 2 months I've been working on high level bindings to the HDF5 library: https://github.com/Vindaar/nimhdf5 It's still very much work in progress (also due to my limited knowledge of Nim and the more low level parts of HDF5). |
By far the most important category is missing from this list, I feel: first-class two-way Python bindings. The ability of Python to interface easily (relatively, for the time) with the then-dominant languages was pivotal in its adoption in scientific computing. I'd use a ton of Nim from Python right away if there was a clean, boilerplate-free method of sending ndarrays back and forth between the two. Last time I checked there was not, and as much as I like Nim I don't see it replacing my entire Python ecosystem any day soon. In particular, I would much rather use Nim than Cython or Numba or any such half-baked language. Boost.Python has the bindings figured out pretty well, but then again I can rarely justify having to deal with C++. A system of bindings with the convenience of Boost.Python but without the C++ would massively expand the usability of Nim for me and, I think, for other scientific programmers. Also, starting out a project in Nim would be a much better proposition if I had the reassurance that I could always pop up a matplotlib debug figure without any hassle. |
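@EelcoHoogendoorn there are a few projects.
- nim-pymod (https://github.com/jboy/nim-pymod) is not maintained and is a little cumbersome in that it requires its own scripts to build, but it allows sending ndarrays back and forth
- nimpy (https://github.com/yglukhov/nimpy) looks more actively maintained, but I am not sure whether it supports Numpy types
- python3 (https://github.com/matkuki/python3) seems to be another one, but I am not sure of its status
None of these projects is fully mature at this point, but this is definitely something doable |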
Of course it is doable; both Python and nim are Turing complete. But
without having the time to put in the work to make these into feature
complete mature solutions myself, it is what is stopping me from using nim
at present.
The good news is that this should be a lot less work than reinventing
matplotlib.
On May 2, 2018 15:29, "Andrea Ferretti" wrote:
@EelcoHoogendoorn <https://github.com/EelcoHoogendoorn> there are a few
projects.
- nim-pymod <https://github.com/jboy/nim-pymod> is not maintained and is a
little cumbersome in that it requires its own scripts to build, but it
allows sending ndarrays back and forth
- nimpy <https://github.com/yglukhov/nimpy> looks more actively
maintained, but I am not sure whether it supports Numpy types
- python3 <https://github.com/matkuki/python3> seems to be another
one, but I am not sure of its status
None of these projects is fully mature at this point, but this is
definitely something doable
|
I think most active Nim users are aware of this by now, but there's a functioning plotting library here: https://github.com/brentp/nim-plotly. It serializes to JSON and uses plotly.js to plot (while still working with the C backend), so the number of points it can handle is limited; with WebGL, though, it can plot ~200K points in my browser and still be tolerably responsive. |
Hi brentp; that's looking pretty cool indeed! Note that I am not trying to take a jab at plotting in Nim specifically, but trying to make a point about the relative size of the Python and Nim ecosystems generally; plotting is just an example. I think it'd be foolish to expect Nim to be able to compete with Python anytime soon on that front; making sure we have first-class two-way interop between the two sounds like it might happen a decade sooner at least. |
And finally we can do non-linear least square fitting in Nim :) |
Finally spent some time to make the interface for my NLopt wrapper nicer and create a PR for nimble for it. |
For some precision engineering/scientific applications, the ability to use arbitrary precision floating point arithmetic would be useful. Does an MPFR wrapper a la Julia's built-in support for BigFloat belong on this list? |
@abudden Certainly. |
it seems that there is still no |
a decent stats package would be a huge boon for my work. Even if it started with t-test and anova. |
https://github.com/fragcolor-xyz/nimtorch Full pytorch for nim, for you. |
Do we want a category for natural language processing? Examples of Python libraries are nltk, gensim, spacy, and scikit-learn. |
Also, how about mathematical optimization - like scipy.optimize for example, and how about signal processing - like scipy.signal? |
@ihendley I think so, yes. |
Simulation
What about simulation? Something like Simulink, Modelica or Modia (in Julia). Something similar to Modia in particular would be nice, given Nim's metaprogramming capabilities. One area where I believe Nim could shine is in exporting FMU models (following the FMI standard). I don't see Python doing that. And even for Julia it is a struggle, because they need to export the runtime for compiled code, which is big and not straightforward (here you can see how the libraries take above 100 MB for a simple example when compiled ahead of time).
Relevant links
FMI Code Generator |
It's been a while since I updated the original post but it's done :) |
having a (nearly?) fully functional jupyter kernel would be quite useful for my work and, I suspect for many people. |
@brentp: There is (or was) I once started playing around with HCR, but wasn't very successful even implementing a trivial repl, https://github.com/vindaar/brokenrepl. Posting it here if anyone wants to give it a try. |
Yes, I saw that and inim from @stisa. Now that there are ggplots and dataframes, the notebook would be a boon. |
(my) jupyternim and inim are the same code, there was a naming conflict with https://github.com/AndreiRegiani/INim so I renamed it. I agree it's due an update, but I have been pretty busy this year. |
I've just published a pure Nim k-d tree implementation here. |
@mratsim, @brentp, @HugoGranstrom and I chatted recently about trying to unify the science-related code a little more. While we didn't decide anything specific yet, we talked about creating an organization to hold related repositories in the future: I only invited a few people who, off the top of my head, use Nim for science-related work. If you want to join, feel free to message me or just join the Gitter channel here: https://gitter.im/SciNim/community and say hi. |
Over Easter I played around with creating a website based on Hugo for this purpose. I am happy to provide it to you. I have uploaded it here: Feel free to use it. |
I've just released a pure Nim fixed-point number library here. I have also started working on a geometry library (mainly focused on GIS and CAD), but it is not yet presentable :) |
My linear algebra package, https://github.com/planetis-m/manu, is still in development and I am happy to accept contributions. |
This is a meta-issue to keep track of discussion around Nim scientific libraries.
Primitive libraries
Decimal128: https://github.com/JohnAD/decimal128
Fixed-point: https://gitlab.com/lbartoletti/fpn
Multidimensional arrays, Linear-algebra
Multidimensional arrays are the basic building block of scientific computing; they go beyond 2D or 3D vectors and matrices. Notable non-Nim implementations include Fortran, Julia, Matlab and Numpy.
Status: in-progress
Libraries:
Support
Arraymancer supports dense multidimensional arrays of any type on CPU (integers, floats, complex), and on Cuda and OpenCL (float only); it uses BLAS, cuBLAS and CLBlast under the hood.
Flambeau provides libtorch bindings and reproduces PyTorch functionality.
Manu is a pure Nim matrix library with no external dependencies
Neo supports dense and sparse float vectors and matrices, on CPU and Cuda (Nvidia GPUs) and also uses BLAS and LAPACK under the hood.
Status: stalled
Libraries:
NimTorch supports most PyTorch features regarding multidimensional arrays on CPU, Cuda, OpenCL and AMD ROCm, provided you compiled PyTorch's ATen backend with the relevant features.
Plotting
Data analysis requires plotting. Notable non-Nim implementations include Python's matplotlib and seaborn, Plot.ly (Python, R, Javascript), R's ggplot2, Matlab and Facebook's Visdom (a simple interface to Plot.ly).
Note that there are a couple of approaches to plotting: either a charting library, or a high-level grammar library (similar to SQL) that hides the low-level details of a chart.
Status: in-progress
Libraries:
Proof-of-concepts:
Unmaintained:
ggplotnim is a pure Nim implementation of the grammar of graphics.
gnuplot.nim is a wrapper of gnuplot.
Nim-Plotly uses the plot.ly charting library as a backend. Both MetaPlot and Monocle use the Vega visualization grammar.
Image processing library
Computer vision is a thriving area of research. Vision scientists need algorithms that work on images represented as multidimensional arrays (unlike, say, Photoshop), preferably multithreaded and GPU-accelerated.
Notable non-Nim libraries include OpenCV, Matlab, Python scikit-image, scipy.ndimage and mahotas.
Status: in-progress
Libraries:
Unmaintained:
Nim-opencv provides rough low-level bindings of OpenCV functions.
Dataframe and columnar/tabular data processing
Dataframes are essential to process structured data (say Name, Age, number of products bought, last time of visit). They allow very efficient data manipulation, including easily creating new columns, joining dataframes, converting between types.
Notable non-Nim packages include Python's Pandas and R's data.table. When data does not fit in RAM, dataframe packages are interfaced with SQL or HDF5 datastores, or even Spark for very large-scale processing.
Status: in-progress
Libraries:
Random library
Lots of scientific algorithms rely on stochastic processes or random distributions.
At the very least, a pseudo-random generator that samples from a normal/Gaussian distribution is needed.
Notable non-Nim libraries include Scipy.
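To make the "at the very least" requirement concrete, here is a minimal sketch of normal sampling built on top of a uniform pseudo-random generator, using the Box-Muller transform. It is written in Python for illustration only; the function names `box_muller` and `normal_samples` are hypothetical and do not come from any of the libraries discussed here, and other methods (e.g. ziggurat) are often preferred in practice.

```python
import math
import random


def box_muller(rng):
    """Turn two uniform draws into two independent standard normal draws."""
    # rng.random() is in [0, 1); 1.0 - rng.random() is in (0, 1], so log() never sees 0.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def normal_samples(n, mean=0.0, stddev=1.0, seed=None):
    """Return n samples from N(mean, stddev^2); hypothetical helper for illustration."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        z0, z1 = box_muller(rng)
        samples.append(mean + stddev * z0)
        if len(samples) < n:
            samples.append(mean + stddev * z1)
    return samples
```

Seeding the generator (for example `normal_samples(1000, mean=2.0, stddev=3.0, seed=42)`) makes runs reproducible, which matters for scientific work.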
Status: in-progress
Libraries:
Statistics library
Notable language: R
Status: standard lib statistics module
Machine learning
Machine learning is how to teach a computer to learn/generalize patterns from data.
Notable non-Nim libraries include: Python's Scikit-Learn and R's Caret.
State-of-the-art C++ library to wrap: XGBoost
Status: in-progress
Deep learning & neural networks
Deep learning is machine learning with neural networks and is arguably eating the world (or at least Reddit, Hacker News and sponsors). In comparison to most traditional machine learning tools, neural networks can also learn very well from non-structured data (images, sounds, text ...).
Notable non-Nim libraries include: Facebook Torch, Google TensorFlow, Apache and Amazon MXNet
Status: in-progress
Libraries:
Proof-of-concept:
Non-linear optimization
Status: in-progress
Libraries:
Linear programming
Status: in-progress
Libraries:
Computational Physics
Status: in-progress
Libraries:
Geometry
Computational geometry also requires tuned algorithms for: geometry primitives, polygons and polyhedra, triangulations, distances, shape analysis ...
Notable non-Nim library: CGAL
Status: no library
Scientific serialization format
There are many formats specific to science or even to individual science domains.
Libraries:
Geospatial library
Scientists often need to deal with geospatial coordinates (latitude, longitude), maps and distances.
This includes efficient data structures like k-d trees or R-trees to compute distances between points, and distance formulas like Haversine to compute distances on a sphere.
Notable non-Nim libraries include Python's scipy.spatial, Geopy and Shapely.
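As an illustration of the Haversine formula mentioned above, here is a minimal sketch in Python. The function name `haversine_km` and the mean Earth radius constant are assumptions chosen for this example, not part of any library listed here; the formula treats the Earth as a perfect sphere, so results are approximate.

```python
import math

# Assumed mean Earth radius in kilometres (spherical approximation).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    # Clamp against floating-point rounding before taking the square root.
    a = min(1.0, max(0.0, a))
    return 2.0 * radius_km * math.asin(math.sqrt(a))
```

For example, `haversine_km(48.8566, 2.3522, 51.5074, -0.1278)` gives the approximate Paris to London distance. A spatial index such as a k-d tree or R-tree would then be used to avoid computing this for every pair of points.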
Status: in-progress
R-tree forum thread.
Proof-of-concepts:
Scientific language bindings
Python:
Unmaintained