Getting Started
Installation guides for Node.js and browser-based environments, including a quick 10-minute walkthrough of Danfo.js.
Installation
There are three ways to install and use Danfo.js in your application:
For Node.js applications, you can install the danfojs-node version via package managers like yarn and npm:
npm install danfojs-node
or
yarn add danfojs-node
For client-side applications built with frameworks like React, Vue, Next.js, etc., you can install the danfojs version:
npm install danfojs
or
yarn add danfojs
For use directly in HTML files, you can add the latest script tag from jsDelivr:
<script src="https://cdn.jsdelivr.net/npm/[email protected]/lib/bundle.min.js"></script>
10 minutes to danfo.js
This is a short introduction to Danfo.js, and its flow is adapted from the official 10 minutes to pandas guide.
We will show you how to use Danfo.js in the browser, with client-side libraries, and in Node.js environments. Most functions, except plotting (which requires a DOM), work the same way in all environments.
Creating a DataFrame/Series
You can create a Series by passing a list of values, letting Danfo.js create a default integer index:
Creating a Series from a tensor
Creating a DataFrame by passing a JSON object:
Creating a DataFrame from a 2D tensor
Creating a DataFrame by passing a dictionary of objects with the same length
The columns of the resulting DataFrame have different dtypes.
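To make "one dtype per column" concrete, here is a minimal sketch in plain JavaScript. It does not use Danfo.js; the `inferDtype` helper and its dtype labels are hypothetical and only illustrate the idea that each column is typed independently.

```javascript
// Hypothetical helper: guess a dtype label for one column of values.
// The labels are illustrative and may not match Danfo.js's own dtype names.
function inferDtype(values) {
  const present = values.filter((v) => v !== null && v !== undefined);
  if (present.length > 0 && present.every((v) => typeof v === "boolean")) {
    return "boolean";
  }
  if (present.length > 0 && present.every((v) => typeof v === "number")) {
    return present.every((v) => Number.isInteger(v)) ? "int" : "float";
  }
  return "string";
}

// A "dictionary" of equal-length arrays, one array per column.
const data = {
  A: [1, 2, 3],
  B: [0.5, 1.5, 2.5],
  C: ["x", "y", "z"],
  D: [true, false, true],
};

// Each column gets its own dtype, independent of the others.
const dtypes = Object.fromEntries(
  Object.entries(data).map(([name, values]) => [name, inferDtype(values)])
);
console.log(dtypes);
```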
Creating a DataFrame by passing an array of arrays. Index and column labels are automatically generated for you.
Viewing data
Here is how to view the top and bottom rows of the frame above:
Display the index, columns:
DataFrame.tensor returns a TensorFlow tensor representation of the underlying data. Note that TensorFlow tensors have one dtype for the entire array, while Danfo.js DataFrames have one dtype per column.
For df, our DataFrame of all floating-point values, DataFrame.tensor is fast and doesn’t require copying data.
Note
DataFrame.tensor does not include the index or column labels in the output.
describe() shows a quick statistic summary of your data:
Sorting by values (defaults to ascending):
Selection
Getting
Selecting a single column, which yields a Series, equivalent to df.A:
Selection by label
For getting a cross-section using a label:
Selecting on a multi-axis by label:
Showing label slicing:
Selection by position
Select via the position of the passed integers:
By integer slices:
By lists of integer position locations:
For slicing rows explicitly:
For slicing columns explicitly:
Selection with Boolean Mask
You can select subsections of a DataFrame with a boolean condition mask. For example, in the following code, we select and return only rows where the column Count is greater than 10.
A boolean mask for filtering also works for multiple conditions combined with the and and or functions. For example, in the following code, we select and return only rows where the column Count is greater than 10 and the column Name is equal to Apples.
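The masking idea itself can be sketched in plain JavaScript. This sketch does not use Danfo.js; the sample rows and the `selectRows` helper are hypothetical, and it only shows how one boolean per row decides which rows are kept and how two masks combine.

```javascript
// Row-oriented sample data; the column names follow the text above.
const rows = [
  { Name: "Apples", Count: 21 },
  { Name: "Mango", Count: 5 },
  { Name: "Banana", Count: 30 },
  { Name: "Apples", Count: 8 },
];

// Build one boolean per row for each condition.
const countMask = rows.map((r) => r.Count > 10);
const nameMask = rows.map((r) => r.Name === "Apples");

// Combine masks element by element (logical AND / OR).
const andMask = countMask.map((flag, i) => flag && nameMask[i]);
const orMask = countMask.map((flag, i) => flag || nameMask[i]);

// Keep only the rows whose mask entry is true.
const selectRows = (data, mask) => data.filter((_, i) => mask[i]);

console.log(selectRows(rows, countMask)); // rows with Count > 10
console.log(selectRows(rows, andMask)); // Count > 10 and Name is "Apples"
```

Because the mask has exactly one entry per row, any condition that produces such an array can be combined with another in the same way.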
Boolean Querying/Filtering
The best way to query data is to use a boolean mask just as we demonstrated above with iloc and loc. For example, in the following code, we use a condition parameter to query the DataFrame:
Querying by a boolean condition is supported from v0.3.0 and above. It also supports condition chaining, as long as the final boolean mask is the same length as the number of DataFrame rows. For example, in the following code, we chain multiple conditions:
Adding a new column
Setting a new column automatically aligns the data by the indexes.
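As a rough model of what index alignment means, the following plain-JavaScript sketch places incoming values by matching index labels rather than by position. It does not call Danfo.js; `alignToIndex` is a hypothetical helper, and using NaN for unmatched labels is an assumption made for the illustration.

```javascript
// Existing table: index labels plus one column.
const index = ["a", "b", "c", "d"];
const table = { A: [10, 20, 30, 40] };

// New column supplied with its own index, in a different order and with a gap.
const incoming = { index: ["c", "a", "b"], values: [3, 1, 2] };

// Hypothetical helper: place each incoming value at the row whose label matches,
// and use NaN where the incoming data has no entry for a label.
function alignToIndex(targetIndex, source) {
  const lookup = new Map(
    source.index.map((label, i) => [label, source.values[i]])
  );
  return targetIndex.map((label) =>
    lookup.has(label) ? lookup.get(label) : NaN
  );
}

table.B = alignToIndex(index, incoming);
console.log(table.B); // [1, 2, 3, NaN]
```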
Missing data
NaN, null, and undefined represent missing data in Danfo.js. These values can be dropped or filled using some functions available in Danfo.js.
To drop any columns that have missing data:
To drop rows that have missing data, set the axis to 1:
Filling missing data:
Filling missing values in specific columns with specific values:
To get the boolean mask where values are NaN:
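The missing-data handling described in this section can be modeled in plain JavaScript. The sketch below does not use Danfo.js; the `isMissing` helper, the sample columns, and the fill values are hypothetical, and it only illustrates detecting, dropping, and filling values that are NaN, null, or undefined.

```javascript
// Treat NaN, null, and undefined as missing, as described above.
const isMissing = (v) => v === null || v === undefined || Number.isNaN(v);

const columns = {
  A: [1, NaN, 3],
  B: [4, 5, 6],
  C: ["x", null, undefined],
};

// Boolean mask with true wherever a value is missing.
const mask = Object.fromEntries(
  Object.entries(columns).map(([name, vals]) => [name, vals.map(isMissing)])
);

// Drop every column that contains at least one missing value.
const withoutMissingColumns = Object.fromEntries(
  Object.entries(columns).filter(([, vals]) => !vals.some(isMissing))
);

// Fill missing values with a per-column replacement;
// columns without a replacement are left untouched.
const fillValues = { A: 0, C: "unknown" };
const filled = Object.fromEntries(
  Object.entries(columns).map(([name, vals]) => [
    name,
    name in fillValues
      ? vals.map((v) => (isMissing(v) ? fillValues[name] : v))
      : vals,
  ])
);

console.log(mask, withoutMissingColumns, filled);
```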
Operations
Stats
Operations, in general, exclude missing data.
Performing a descriptive statistic:
Same operation on the row axis:
Operating on objects that have different dimensionality and need alignment. Danfo.js automatically broadcasts along the specified dimension.
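Two ideas from this section, statistics that skip missing values and broadcasting a one-dimensional object across a table, can be sketched in plain JavaScript. This is not Danfo.js code; `columnMeans` and `subtractFromEachRow` are hypothetical helpers written for the illustration.

```javascript
const isMissing = (v) => v === null || v === undefined || Number.isNaN(v);

// Rows of a small numeric table; one value is missing.
const matrix = [
  [1, 10],
  [NaN, 20],
  [3, 30],
];

// Column means that skip missing values.
function columnMeans(rows) {
  const nCols = rows[0].length;
  return Array.from({ length: nCols }, (_, c) => {
    const present = rows.map((r) => r[c]).filter((v) => !isMissing(v));
    return present.reduce((sum, v) => sum + v, 0) / present.length;
  });
}

// Broadcast a one-dimensional array across every row (one entry per column).
const subtractFromEachRow = (rows, perColumn) =>
  rows.map((r) => r.map((v, c) => v - perColumn[c]));

const means = columnMeans(matrix); // [2, 20]
const centered = subtractFromEachRow(matrix, means);
console.log(means, centered);
```

The missing entry is ignored when computing the mean but stays NaN in the centered result, since arithmetic on NaN yields NaN.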
Apply
Applying functions to the data along a specified axis. If axis = 1 (default), the specified function (callable) is called with each row; if axis = 0, it is called with each column:
Applying element-wise operations to the data:
You can use the applyMap function if you need to apply a function to each element in the DataFrame. applyMap works element-wise.
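The difference between applying a function along an axis and applying it element-wise can be sketched in plain JavaScript. The helpers `applyAlong` and `mapElements` are hypothetical stand-ins, not Danfo.js functions; the axis convention simply follows the description above.

```javascript
const table = [
  [1, 2, 3],
  [4, 5, 6],
];

// Hypothetical helper following the convention in the text:
// axis 1 passes each row to fn, axis 0 passes each column.
function applyAlong(rows, fn, axis = 1) {
  if (axis === 1) return rows.map((row) => fn(row));
  const nCols = rows[0].length;
  return Array.from({ length: nCols }, (_, c) =>
    fn(rows.map((row) => row[c]))
  );
}

// Element-wise counterpart: fn sees one value at a time.
const mapElements = (rows, fn) => rows.map((row) => row.map((v) => fn(v)));

const sum = (values) => values.reduce((a, b) => a + b, 0);

const rowSums = applyAlong(table, sum, 1); // [6, 15]
const colSums = applyAlong(table, sum, 0); // [5, 7, 9]
const doubled = mapElements(table, (v) => v * 2); // [[2, 4, 6], [8, 10, 12]]
console.log(rowSums, colSums, doubled);
```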
String Methods
Series is equipped with a set of string processing methods in the str attribute that make it easy to operate on each element of the array, as in the code snippet below. Note that pattern-matching in str generally uses JavaScript regular expressions by default (and in some cases always uses them).
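What "operating on each element" with JavaScript regular expressions looks like can be sketched without the library. The following plain-JavaScript example uses a hypothetical `mapStrings` helper and sample values; it is not the str accessor itself.

```javascript
const names = ["Apple Pie", "banana split", "Cherry Tart"];

// Apply a string operation to every element, leaving non-strings untouched.
const mapStrings = (values, fn) =>
  values.map((v) => (typeof v === "string" ? fn(v) : v));

const lower = mapStrings(names, (s) => s.toLowerCase());

// Pattern matching with a JavaScript regular expression.
const startsWithCapital = names.map((s) => /^[A-Z]/.test(s)); // [true, false, true]

// Regex-based replacement: collapse each run of whitespace into an underscore.
const slugs = mapStrings(lower, (s) => s.replace(/\s+/g, "_"));
console.log(lower, startsWithCapital, slugs);
```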
See more string accessors here
Merge
Concat
Danfo.js provides various methods for easily combining Series and DataFrame objects, with various kinds of set logic for the indexes and relational algebra functionality in the case of join / merge-type operations.
Concatenating DataFrames with concat():
Concatenate along row axis (0).
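Concatenation along the row axis amounts to stacking tables on top of each other. The plain-JavaScript sketch below models that with a hypothetical `concatRows` helper; it is not the concat() function, and padding absent columns with NaN is an assumption made for the illustration.

```javascript
// Two column-oriented tables that share the same columns.
const first = { A: [1, 2], B: ["x", "y"] };
const second = { A: [3], B: ["z"] };

// Hypothetical helper: stack tables on top of each other (row axis).
// Columns missing from a table are padded with NaN.
function concatRows(tables) {
  const names = [...new Set(tables.flatMap((t) => Object.keys(t)))];
  const result = Object.fromEntries(names.map((n) => [n, []]));
  for (const t of tables) {
    const nRows = Math.max(0, ...Object.values(t).map((col) => col.length));
    for (const n of names) {
      result[n].push(...(t[n] ?? Array(nRows).fill(NaN)));
    }
  }
  return result;
}

const stacked = concatRows([first, second]);
console.log(stacked); // A: [1, 2, 3], B: ["x", "y", "z"]
```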
Join
SQL style merges. See the Pandas Database style joining section for more info.
See the merge section for more examples
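For readers new to SQL-style joins, an inner join on one key column can be sketched in plain JavaScript. The `innerJoin` helper and the sample tables are hypothetical and do not reflect the library's merge signature; the sketch only shows that rows are paired when their key values match.

```javascript
const left = [
  { key: "K0", A: "A0" },
  { key: "K1", A: "A1" },
  { key: "K2", A: "A2" },
];
const right = [
  { key: "K0", B: "B0" },
  { key: "K2", B: "B2" },
  { key: "K3", B: "B3" },
];

// Hypothetical helper: inner join on a single key column.
function innerJoin(leftRows, rightRows, on) {
  // Index the right table by key so each lookup is constant time.
  const byKey = new Map();
  for (const row of rightRows) {
    if (!byKey.has(row[on])) byKey.set(row[on], []);
    byKey.get(row[on]).push(row);
  }
  // Emit one merged row per matching (left, right) pair.
  return leftRows.flatMap((l) =>
    (byKey.get(l[on]) ?? []).map((r) => ({ ...l, ...r }))
  );
}

const joined = innerJoin(left, right, "key");
console.log(joined);
// [{ key: "K0", A: "A0", B: "B0" }, { key: "K2", A: "A2", B: "B2" }]
```

Keys that appear in only one table (K1 and K3 here) are dropped, which is what distinguishes an inner join from outer, left, or right joins.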
Grouping
By “group by” we are referring to a process involving one or more of the following steps:
Splitting the data into groups based on some criteria
Applying a function to each group independently
Combining the results into a data structure
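The three steps above can be sketched in plain JavaScript. This is a conceptual model only, not the Danfo.js grouping API; `splitIntoGroups`, `groupSum`, the sample records, and the "|" separator used to label multi-column groups are all hypothetical.

```javascript
const records = [
  { A: "foo", B: "one", C: 1 },
  { A: "bar", B: "one", C: 2 },
  { A: "foo", B: "two", C: 3 },
  { A: "bar", B: "two", C: 4 },
  { A: "foo", B: "one", C: 5 },
];

// Split: bucket rows by the values of the key columns.
function splitIntoGroups(rows, keys) {
  const groups = new Map();
  for (const row of rows) {
    const label = keys.map((k) => row[k]).join("|");
    if (!groups.has(label)) groups.set(label, []);
    groups.get(label).push(row);
  }
  return groups;
}

// Apply + combine: sum one column per group and collect the results.
function groupSum(rows, keys, column) {
  const result = {};
  for (const [label, members] of splitIntoGroups(rows, keys)) {
    result[label] = members.reduce((total, row) => total + row[column], 0);
  }
  return result;
}

const byA = groupSum(records, ["A"], "C"); // { foo: 9, bar: 6 }
const byAB = groupSum(records, ["A", "B"], "C");
console.log(byA, byAB);
```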
See the Grouping section.
Grouping and then applying the sum() function to the resulting groups.
Grouping by multiple columns forms a hierarchical index, and again we can apply the sum() function.
Time series
Danfo.js provides simple, powerful, and efficient functionality for working with DateTime data. See the dt Accessors section.
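As a rough illustration of the kind of per-element date extraction a datetime accessor performs, the following plain-JavaScript sketch uses the built-in Date object. It does not use the dt accessor; the sample dates and variable names are hypothetical.

```javascript
const stamps = ["2019-01-01", "2019-02-15", "2020-12-31"];

// Date-only ISO strings are parsed as UTC, so the UTC getters are used
// to keep the results independent of the local time zone.
const dates = stamps.map((s) => new Date(s));

const dayNames = [
  "Sunday", "Monday", "Tuesday", "Wednesday",
  "Thursday", "Friday", "Saturday",
];

const years = dates.map((d) => d.getUTCFullYear()); // [2019, 2019, 2020]
const months = dates.map((d) => d.getUTCMonth() + 1); // [1, 2, 12]
const weekdays = dates.map((d) => dayNames[d.getUTCDay()]);
console.log(years, months, weekdays);
```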
More Examples:
Plotting
See the Plotting docs.
We currently support Plotly.js for plotting. In the future, we plan to support other JS plotting libraries such as Vega and D3.
Using the plot API, you can make interactive plots from DataFrame and Series. Plotting only works in the browser/client-side version of Danfo.js, and requires an HTML div to display plots.

On a DataFrame, the plot() method exposes various plot types. By default, all columns are plotted unless specified otherwise.

Getting data in/out
CSV
Convert any DataFrame to CSV format.
In Node.js, if a file path is specified, the CSV is saved to that path; otherwise it is returned as a string.
In the browser, you can automatically download the file as CSV by setting the download parameter to true.
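To show what the returned CSV string looks like in principle, here is a minimal plain-JavaScript serializer. The `toCsvText` helper is hypothetical and is not the library's CSV writer; it assumes comma separators and double-quote escaping.

```javascript
// Hypothetical helper: serialize a header and rows as CSV text.
function toCsvText(header, rows) {
  // Quote fields containing commas, quotes, or newlines; double embedded quotes.
  const escape = (value) => {
    const text = String(value);
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  return [header, ...rows].map((r) => r.map(escape).join(",")).join("\n");
}

const csv = toCsvText(
  ["Name", "Count"],
  [
    ["Apples", 21],
    ["Mango, ripe", 5],
  ]
);
console.log(csv);
```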
The readCSV method can read CSV files from local disk, or over the internet. Both full and relative paths are supported. For example, to read a CSV file at the path /home/Desktop/titanic.csv, you can do the following:
JSON
Writing to JSON format
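Tabular data is commonly written to JSON either column by column or row by row. The plain-JavaScript sketch below converts between the two shapes; the `toRowRecords` helper and sample data are hypothetical and do not represent the library's JSON writer or its options.

```javascript
const columnsJson = { Name: ["Apples", "Mango"], Count: [21, 5] };

// Column-oriented -> row-oriented: one object per row.
function toRowRecords(columnData) {
  const names = Object.keys(columnData);
  const nRows = names.length ? columnData[names[0]].length : 0;
  return Array.from({ length: nRows }, (_, i) =>
    Object.fromEntries(names.map((n) => [n, columnData[n][i]]))
  );
}

const rowJson = JSON.stringify(toRowRecords(columnsJson));
console.log(rowJson);
// [{"Name":"Apples","Count":21},{"Name":"Mango","Count":5}]
```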