Eric J Ma's Website


Moving Data Securely and Quickly with `croc`

written by Eric J. Ma on 2020-10-01 | tags: tools tips tricks croc

I found a new free and open source tool, called croc, for moving data between computers securely and quickly. I highly recommend it! Read on to learn more.

Read on... (491 words, approximately 3 minutes reading time)

Tools to help you write consistent Python code

written by Eric J. Ma on 2020-09-30 | tags: data science tips and tricks programming python command line tools code formatting code style

How do you write clean Python code in the year 2020? Check out this latest essay I wrote. Cross-posted from my essays collection.

Read on... (1488 words, approximately 8 minutes reading time)

Add a direct Binder link for built HTML notebooks

written by Eric J. Ma on 2020-09-12 | tags: jupyter notebooks data science education teaching binder

I recently figured out how to dynamically insert a Binder badge into HTML pages built from Jupyter notebooks, so that users can open a notebook in the correct conda environment with one click, without needing to navigate or build an environment from scratch. Come see how I figured this out!

Read on... (383 words, approximately 2 minutes reading time)

Faster iteration over dataframes

written by Eric J. Ma on 2020-09-07 | tags: data science pandas tricks tips productivity

If df.iterrows() is slow, what then is the alternative? Read on to figure out how to make looping over dataframes 1000X faster :).

Read on... (154 words, approximately 1 minute reading time)

Pandera, Data Validation, and Statistics

written by Eric J. Ma on 2020-08-30 | tags: data science statistics data validation data engineering pandera software tools tips and tricks

I test-drove the Python package pandera this month, and I like its usage paradigm. Read on to learn how you can incorporate pandera into your workflow!

Read on... (993 words, approximately 5 minutes reading time)

Software Engineering as a Research Practice

written by Eric J. Ma on 2020-08-21 | tags: data science software engineering software skills

Why do software skills matter for data scientists? We might have heard that they matter for our workflow, but what about for organizing knowledge? In this essay, I argue that practicing good software skills has both of those benefits and more.

Read on... (2198 words, approximately 11 minutes reading time)

Data/Software Challenges as Tools for Hiring

written by Eric J. Ma on 2020-07-26 | tags: data science hiring data challenge

In this post, I detail some of my thoughts about the use of "data challenges" for hiring data scientists. Though I have only used them in hiring data science interns, I think some of the lessons I've learned from test-driving the process can apply more generally.

Read on... (1973 words, approximately 10 minutes reading time)

Jupyter notebooks as scripts

written by Eric J. Ma on 2020-07-11 | tags: jupyter jupyter notebook notebook data science

I like writing in notebooks because they let me prototype quickly. But can we treat Jupyter notebooks as scripts that we can execute? The answer is yes, and in this blog post, I'll show you a few of the simplest ways to do so.

Read on... (621 words, approximately 4 minutes reading time)

How I feel about Hey

written by Eric J. Ma on 2020-06-29 | tags: reviews email general stuff

I test-drove Hey, a new email product launched by Basecamp recently, and I'm ready to pay. Read on to find out why!

Read on... (888 words, approximately 5 minutes reading time)

Statistical tests are just canned model comparisons

written by Eric J. Ma on 2020-06-28 | tags: data science bayesian statistics hypothesis testing

Today I had the epiphany that "statistical testing" protocols are nothing more than canned model comparisons (that sometimes have convoluted interpretations). Read on to see why!

Read on... (468 words, approximately 3 minutes reading time)