31 posts with this tag
When I first learned least-squares linear regression during my undergrad degree, we approached it in the "calculus" way: taking the sum of the squared differences across all observations and solving a massive (and tedious) equation until we …
Hierarchical clustering functionality in R is great, right? Between dist and vegdist it is possible to base your clustering on almost any method you want, from cosine to Canberra. However, what if you do want to use a different or custom method, and …
A couple of weeks ago, I wrote a post giving you an introduction to reproducible research in Python. While the principles of reproducibility stay the same no matter the language you are using, there are some specific libraries and tools that R has …
A few weeks ago, I was lucky enough to present at PyCon Australia, right here in my home town of Melbourne. My talk was on reproducible research, an increasingly important concept in science and data analysis as projects become more complicated and …
One of the things I really missed when I moved from Stata to R was how easy Stata made group operations; that is, applying summary statistics by levels of a variable or variables in a dataset. Fortunately for me, I just needed a bit of …