A Data-Driven Approach To Cryptocurrency Speculation
How do Bitcoin markets behave? What are the causes of the sudden spikes and dips in cryptocurrency values? Are the markets for different altcoins inseparably linked or largely independent? How can we predict what will happen next?
Articles on cryptocurrencies, such as Bitcoin and Ethereum, are rife with speculation these days, with hundreds of self-proclaimed experts advocating for the trends that they expect to emerge. What is lacking from many of these analyses is a strong foundation of data and statistics to back up the claims.
The goal of this article is to provide an easy introduction to cryptocurrency analysis using Python. We will walk through a simple Python script to retrieve, analyze, and visualize data on different cryptocurrencies. In the process, we will uncover an interesting trend in how these volatile markets behave, and how they are evolving.
This is not a post explaining what cryptocurrencies are (if you want one, I would recommend this fine overview), nor is it an opinion piece on which specific currencies will rise and which will fall. Instead, all that we are concerned about in this tutorial is procuring the raw data and uncovering the stories hidden in the numbers.
Step 1 – Setup Your Data Laboratory
The tutorial is intended to be accessible for enthusiasts, engineers, and data scientists at all skill levels. The only skills that you will need are a basic understanding of Python and enough knowledge of the command line to set up a project.
A completed version of the notebook with all of the results is available here.
Step 1.1 – Install Anaconda
The easiest way to install the dependencies for this project from scratch is to use Anaconda, a prepackaged Python data science ecosystem and dependency manager.
To set up Anaconda, I would recommend following the official installation instructions: https://www.continuum.io/downloads.
If you’re an advanced user, and you don’t want to use Anaconda, that’s totally fine; I’ll assume you don’t need help installing the required dependencies. Feel free to skip to section 2.
Step 1.2 – Setup an Anaconda Project Environment
Once Anaconda is installed, we’ll want to create a new environment to keep our dependencies organized.
Run conda create --name cryptocurrency-analysis python=3 to create a new Anaconda environment for our project.
Next, run source activate cryptocurrency-analysis (on Linux/macOS) or activate cryptocurrency-analysis (on Windows) to activate this environment.
Finally, run conda install numpy pandas nb_conda jupyter plotly quandl to install the required dependencies in the environment. This could take a few minutes to complete.
Why use environments? If you plan on developing multiple Python projects on your computer, it is helpful to keep the dependencies (software libraries and packages) separate in order to avoid conflicts. Anaconda will create a special environment directory for the dependencies for each project to keep everything organized and separated.
Step 1.3 – Start An Interactive Jupyter Notebook
Once the environment and dependencies are all set up, run jupyter notebook to start the iPython kernel, and open your browser to http://localhost:8888/. Create a new Python notebook, making sure to use the Python [conda env:cryptocurrency-analysis] kernel.
Step 1.4 – Import the Dependencies At The Top of The Notebook
Once you’ve got a blank Jupyter notebook open, the first thing we’ll do is import the required dependencies.
We’ll also import Plotly and enable the offline mode.
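As a rough sketch, the first notebook cell might look something like the following. The aliases (np, pd, py, go) and the setup_error variable are assumptions made for this illustration. The guard around the third-party imports is also an addition: it only exists so that a missing package, or running the cell outside a notebook, produces a readable message.

```python
import os
import pickle
from datetime import datetime

import numpy as np
import pandas as pd

# Packages installed in Step 1.2. The try/except is an addition for this
# sketch: it records a readable message instead of stopping with a traceback.
try:
    import quandl
    import plotly.offline as py
    import plotly.graph_objs as go

    # Offline mode renders charts inside the notebook without a Plotly account.
    py.init_notebook_mode(connected=True)
    setup_error = None
except ImportError as err:
    setup_error = str(err)

print("Setup OK" if setup_error is None else "Setup problem: " + setup_error)
```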
Step 2 – Retrieve Bitcoin Pricing Data
Now that everything is set up, we’re ready to start retrieving data for analysis. First, we need to get Bitcoin pricing data using Quandl’s free Bitcoin API.
Step 2.1 – Define Quandl Helper Function
To assist with this data retrieval we’ll define a function to download and cache datasets from Quandl.
We’re using pickle to serialize and save the downloaded data as a file, which will prevent our script from re-downloading the same data each time we run the script. The function will return the data as a Pandas dataframe. If you’re not familiar with dataframes, you can think of them as super-powered spreadsheets.
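A minimal sketch of such a helper, under a few assumptions: the names load_cached and get_quandl_data are hypothetical, the cache file name is derived from the dataset code, and the download step is passed in as a function so that the caching logic can be tried offline. The only Quandl call used is quandl.get with returns="pandas".

```python
import os
import tempfile

import pandas as pd


def load_cached(cache_path, download):
    """Return the dataframe pickled at cache_path, downloading it on a cache miss."""
    try:
        df = pd.read_pickle(cache_path)
        print("Loaded {} from cache".format(cache_path))
    except OSError:
        # No cache file yet: fetch the data once and pickle it for next time.
        print("Downloading data for {}".format(cache_path))
        df = download()
        df.to_pickle(cache_path)
    return df


def get_quandl_data(quandl_id, cache_dir="."):
    """Download and cache a Quandl dataset (hypothetical helper name)."""
    import quandl  # imported lazily so the caching logic can run without it

    file_name = "{}.pkl".format(quandl_id).replace("/", "-")
    cache_path = os.path.join(cache_dir, file_name)
    return load_cached(cache_path, lambda: quandl.get(quandl_id, returns="pandas"))


# Offline demonstration with a stand-in downloader and made-up values.
calls = []


def fake_download():
    calls.append(1)
    return pd.DataFrame({"Weighted Price": [1.0, 2.0]})


demo_path = os.path.join(tempfile.mkdtemp(), "demo.pkl")
first = load_cached(demo_path, fake_download)   # cache miss: "downloads"
second = load_cached(demo_path, fake_download)  # cache hit: reads the pickle
```

In the notebook, a call along the lines of get_quandl_data('BCHARTS/KRAKENUSD') would then fetch a series once and read it from disk afterwards; that dataset code is an assumption and should be checked against Quandl's catalogue.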
Step 2.2 – Pull Kraken Exchange Pricing Data
Let’s first pull the historical Bitcoin exchange rate for the Kraken Bitcoin exchange.
We can inspect the first 5 rows of the dataframe using the head() method.
Next, we’ll generate a simple chart as a quick visual verification that the data looks correct.
Here, we’re using Plotly for generating our visualizations. This is a less traditional choice than some of the more established Python data visualization libraries such as Matplotlib, but I think Plotly is a good choice since it produces fully-interactive charts using D3.js. These charts have attractive visual defaults, are easy to explore, and are very simple to embed in web pages.
As a quick sanity check, you should compare the generated chart with publicly available graphs on Bitcoin prices (such as those on Coinbase), to verify that the downloaded data is legit.
Step 2.3 – Pull Pricing Data From More BTC Exchanges
You might have noticed a hitch in this dataset: there are a few notable down-spikes, particularly in late 2014 and early 2018. These spikes are specific to the Kraken dataset, and we obviously don’t want them to be reflected in our overall pricing analysis.
The nature of Bitcoin exchanges is that the pricing is determined by supply and demand, hence no single exchange contains a true "master price" of Bitcoin. To solve this issue, along with that of down-spikes (which are likely the result of technical outages and data set glitches), we will pull data from three more major Bitcoin exchanges to calculate an aggregate Bitcoin price index.
First, we will download the data from each exchange into a dictionary of dataframes.
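The dictionary could be built roughly as follows. This is only a sketch: the exchange names, the "BCHARTS/...USD" dataset code pattern, and the load_exchange_data helper are assumptions made for illustration, and the loader is a stand-in returning made-up prices so the example runs offline.

```python
import pandas as pd


def load_exchange_data(exchanges, loader):
    """Build {exchange name: dataframe} by calling loader once per dataset code."""
    exchange_data = {}
    for exchange in exchanges:
        dataset_code = "BCHARTS/{}USD".format(exchange)  # assumed code pattern
        exchange_data[exchange] = loader(dataset_code)
    return exchange_data


# Stand-in for the real download helper; it records what was requested.
requested = []


def fake_loader(dataset_code):
    requested.append(dataset_code)
    return pd.DataFrame({"Weighted Price": [100.0, 101.0]})


exchanges = ["COINBASE", "BITSTAMP", "ITBIT"]  # assumed exchange names
exchange_data = load_exchange_data(exchanges, fake_loader)
print(sorted(exchange_data))
```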
Step 2.4 – Merge All Of The Pricing Data Into A Single Dataframe
Next, we will define a simple function to merge a common column of each dataframe into a new combined dataframe.
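One possible shape for this function is sketched below. The argument order (dataframes, labels, col) is an assumption, and the two small frames are made-up data used only to show how rows are aligned by date.

```python
import pandas as pd


def merge_dfs_on_column(dataframes, labels, col):
    """Combine one column from each dataframe into a single dataframe."""
    series_dict = {}
    for label, df in zip(labels, dataframes):
        series_dict[label] = df[col]
    # Building a dataframe from a dict of series aligns them on their index.
    return pd.DataFrame(series_dict)


# Two tiny made-up exchange frames; the second is missing the last day.
dates = pd.date_range("2017-01-01", periods=3, freq="D")
kraken = pd.DataFrame({"Weighted Price": [900.0, 910.0, 905.0]}, index=dates)
bitstamp = pd.DataFrame({"Weighted Price": [901.0, 909.0]}, index=dates[:2])

combined = merge_dfs_on_column([kraken, bitstamp], ["KRAKEN", "BITSTAMP"], "Weighted Price")
print(combined.tail())
```

Dates that are missing from one exchange show up as NaN in that exchange's column rather than being dropped.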
Now we will merge all of the dataframes together on their "Weighted Price" column.
Finally, we can preview the last five rows of the result using the tail() method, to make sure it looks ok.
The prices look to be as expected: they are in similar ranges, but with slight variations based on the supply and demand of each individual Bitcoin exchange.
Step 2.5 – Visualize The Pricing Datasets
The next logical step is to visualize how these pricing datasets compare. For this, we’ll define a helper function to provide a single-line command to generate a graph from the dataframe.
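A simplified sketch of such a helper follows. It assumes the helper only needs a title and a y-axis scale, and it returns a plain figure dictionary instead of drawing anything, so it can be inspected without Plotly installed. In a notebook, the result can be passed to plotly.offline.iplot, which accepts figure dictionaries. The sample prices are made up.

```python
import pandas as pd


def df_scatter(df, title, scale="linear"):
    """Build a Plotly-style figure dict with one line trace per dataframe column."""
    traces = []
    for column in df.columns:
        traces.append({
            "type": "scatter",
            "mode": "lines",
            "name": column,
            "x": list(df.index),
            "y": df[column].tolist(),
        })
    layout = {
        "title": title,
        "yaxis": {"type": scale},  # "linear" or "log"
    }
    return {"data": traces, "layout": layout}


# Made-up prices for two exchanges.
dates = pd.date_range("2017-01-01", periods=2, freq="D")
prices = pd.DataFrame({"KRAKEN": [900.0, 910.0], "BITSTAMP": [901.0, 909.0]}, index=dates)
fig = df_scatter(prices, "Bitcoin Price (USD) By Exchange", scale="log")
```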
In the interest of brevity, I won’t go too far into how this helper function works. Check out the documentation for Pandas and Plotly if you would like to learn more.
We can now easily generate a graph for the Bitcoin pricing data.
Step 2.6 – Clean and Aggregate the Pricing Data
We can see that, although the four series follow roughly the same path, there are various irregularities in each that we’ll want to get rid of.
Let’s remove all of the zero values from the dataframe, since we know that the price of Bitcoin has never been equal to zero in the timeframe that we are examining.
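One way to do this is to replace zeros with NaN, which Pandas and Plotly both treat as missing data. The frame below contains made-up numbers purely to show the effect.

```python
import numpy as np
import pandas as pd

# Made-up prices with two zero glitches.
prices = pd.DataFrame({
    "KRAKEN": [900.0, 0.0, 905.0],
    "BITSTAMP": [901.0, 909.0, 0.0],
})

# Zeros become NaN, so they are skipped in later averages and charts.
cleaned = prices.replace(0, np.nan)
print(cleaned)
```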
When we re-chart the dataframe, we’ll see a much cleaner looking chart without the down-spikes.
We can now calculate a new column, containing the average daily Bitcoin price across all of the exchanges.
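A short sketch of the row-wise average, again on made-up numbers. The column name avg_btc_price_usd is an assumption chosen for this example.

```python
import numpy as np
import pandas as pd

# Made-up cleaned prices; NaN marks a removed glitch.
prices = pd.DataFrame({
    "KRAKEN": [900.0, np.nan],
    "BITSTAMP": [902.0, 910.0],
})

# mean(axis=1) averages across columns for each row and skips NaN by default.
prices["avg_btc_price_usd"] = prices.mean(axis=1)
print(prices)
```

Because missing values are skipped, a day on which one exchange has no data is averaged over the remaining exchanges only.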
This new column is our Bitcoin pricing index! Let’s chart that column to make sure it looks ok.
Yup, looks good. We’ll use this aggregate pricing series later on, in order to convert the exchange rates of other cryptocurrencies to USD.
Step 3 – Retrieve Altcoin Pricing Data
Now that we have a solid time series dataset for the price of Bitcoin, let’s pull in some data for non-Bitcoin cryptocurrencies, commonly referred to as altcoins.
Step 3.1 – Define Poloniex API Helper Functions
For retrieving data on cryptocurrencies we’ll be using the Poloniex API. To assist in the altcoin data retrieval, we’ll define two helper functions to download and cache JSON data from this API.
First, we’ll define get_json_data, which will download and cache JSON data from a provided URL.
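A minimal sketch of get_json_data is shown below. The cache_path argument and the optional reader argument are assumptions: reader defaults to pandas.read_json, which can read directly from a URL, and it is exposed here only so the caching behaviour can be demonstrated without network access. The URL and values in the demonstration are made up.

```python
import os
import tempfile

import pandas as pd


def get_json_data(json_url, cache_path, reader=pd.read_json):
    """Download JSON data from json_url and cache it as a pickled dataframe."""
    try:
        df = pd.read_pickle(cache_path)
        print("Loaded {} from cache".format(json_url))
    except OSError:
        print("Downloading {}".format(json_url))
        df = reader(json_url)
        df.to_pickle(cache_path)
        print("Cached {} at {}".format(json_url, cache_path))
    return df


# Offline demonstration with a stand-in reader and made-up values.
reads = []


def fake_reader(url):
    reads.append(url)
    return pd.DataFrame({"close": [0.010, 0.012]})


demo_path = os.path.join(tempfile.mkdtemp(), "BTC_ETH.pkl")
first = get_json_data("https://example.com/data.json", demo_path, reader=fake_reader)
second = get_json_data("https://example.com/data.json", demo_path, reader=fake_reader)
```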
Next, we’ll define a function that will generate Poloniex API HTTP requests, and will subsequently call our new get_json_data function to save the resulting data.
This function will take a cryptocurrency pair string (such as ‘BTC_ETH’) and return a dataframe containing the historical exchange rate of the two currencies.
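The sketch below shows how such a request might be assembled. Several things here are assumptions: the endpoint and query parameters follow Poloniex's legacy public returnChartData call as I remember it and may no longer be served, the names build_poloniex_url and get_crypto_data are hypothetical, dates are treated as UTC, and the download function is passed in so the example can run offline with made-up data.

```python
from datetime import datetime, timezone

import pandas as pd

# Assumed legacy endpoint; check the current Poloniex API documentation.
BASE_POLO_URL = (
    "https://poloniex.com/public?command=returnChartData"
    "&currencyPair={}&start={}&end={}&period={}"
)
DAILY = 86400  # candle size in seconds (one data point per day)


def build_poloniex_url(poloniex_pair, start_date, end_date, period=DAILY):
    """Format the chart-data URL, treating naive datetimes as UTC."""
    start_ts = int(start_date.replace(tzinfo=timezone.utc).timestamp())
    end_ts = int(end_date.replace(tzinfo=timezone.utc).timestamp())
    return BASE_POLO_URL.format(poloniex_pair, start_ts, end_ts, period)


def get_crypto_data(poloniex_pair, start_date, end_date, get_json_data):
    """Return the exchange-rate history for a pair such as 'BTC_ETH'."""
    json_url = build_poloniex_url(poloniex_pair, start_date, end_date)
    data_df = get_json_data(json_url, "{}.pkl".format(poloniex_pair))
    return data_df.set_index("date")


# Offline demonstration: a stand-in downloader returning made-up rows.
def fake_get_json_data(json_url, cache_path):
    return pd.DataFrame({
        "date": pd.to_datetime(["2015-01-01", "2015-01-02"]),
        "weightedAverage": [0.010, 0.012],
    })


url = build_poloniex_url("BTC_ETH", datetime(2015, 1, 1), datetime(2015, 1, 2))
eth = get_crypto_data("BTC_ETH", datetime(2015, 1, 1), datetime(2015, 1, 2), fake_get_json_data)
```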
Step 3.2 – Download Trading Data From Poloniex
Most altcoins cannot be bought directly with USD; to acquire these coins, individuals often buy Bitcoins and then trade the Bitcoins for altcoins on cryptocurrency exchanges. For this reason, we’ll be downloading the exchange rate to BTC for each coin, and then we’ll use our existing BTC pricing data to convert this value to USD.
We’ll download exchange data for nine of the top cryptocurrencies.
Now we have a dictionary with 9 dataframes, each containing the historical daily average exchange prices between the altcoin and Bitcoin.
We can preview the last few rows of the Ethereum price table to make sure it looks ok.
Step 3.3 – Convert Prices to USD
Now we can combine this BTC-altcoin exchange rate data with our Bitcoin pricing index to directly calculate the historical USD values for each altcoin.
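The conversion is a date-aligned multiplication: the altcoin's price in BTC times the BTC price in USD. The sketch below uses made-up numbers, and the column names weightedAverage and price_usd are assumptions.

```python
import pandas as pd

dates = pd.date_range("2017-01-01", periods=3, freq="D")

# Made-up Bitcoin price index in USD, indexed by date.
btc_usd = pd.Series([1000.0, 1100.0, 1050.0], index=dates)

# Made-up altcoin prices quoted in BTC.
eth = pd.DataFrame({"weightedAverage": [0.010, 0.012, 0.011]}, index=dates)

# Pandas matches rows by date before multiplying.
eth["price_usd"] = eth["weightedAverage"] * btc_usd
print(eth)
```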
Here, we’ve created a new column in each altcoin dataframe with the USD prices for that coin.
Next, we can re-use our merge_dfs_on_column function from earlier to create a combined dataframe of the USD price for each cryptocurrency.
Easy. Now let’s also add the Bitcoin prices as a final column to the combined dataframe.
Now we should have a single dataframe containing daily USD prices for the ten cryptocurrencies that we’re examining.
Let’s reuse our df_scatter function from earlier to chart all of the cryptocurrency prices against each other.
Nice! This graph provides a pretty solid "big picture" view of how the exchange rates for each currency have varied over the past few years.
Note that we’re using a logarithmic y-axis scale in order to compare all of the currencies on the same plot. You are welcome to try out different parameter values here (such as scale='linear') to get different perspectives on the data.
Step 3.4 – Perform Correlation Analysis
You might notice that the cryptocurrency exchange rates, despite their wildly different values and volatility, look slightly correlated. Especially since the spike in April 2018, even many of the smaller fluctuations appear to be occurring in sync across the entire market.
A visually-derived hunch is not much better than a guess until we have the stats to back it up.
We can test our correlation hypothesis using the Pandas corr() method, which computes a Pearson correlation coefficient for each column in the dataframe against each other column.
Revision Note 8/22/2018 – This section has been revised in order to use the daily return percentages instead of the absolute price values in calculating the correlation coefficients.
Computing correlations directly on a non-stationary time series (such as raw pricing data) can give biased correlation values. We will work around this by first applying the pct_change() method, which will convert each cell in the dataframe from an absolute price value to a daily return percentage.
First we’ll calculate correlations for 2018.
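The calculation can be sketched as a small function that filters one calendar year, converts prices to daily returns, and correlates the returns. The function name return_correlations is hypothetical, and the three series (A, B, C) and their dates are made up for illustration: B has the same daily returns as A, while C moves in the opposite direction.

```python
import pandas as pd


def return_correlations(prices, year):
    """Pearson correlations of daily percentage returns within one calendar year."""
    in_year = prices[prices.index.year == year]
    # pct_change turns each price into the fractional change from the previous day.
    return in_year.pct_change().corr(method="pearson")


dates = pd.date_range("2017-03-01", periods=6, freq="D")
prices = pd.DataFrame({
    "A": [100.0, 110.0, 99.0, 120.0, 118.0, 130.0],
    "B": [10.0, 11.0, 9.9, 12.0, 11.8, 13.0],    # same returns as A
    "C": [50.0, 45.0, 49.5, 40.0, 41.0, 37.0],   # falls when A rises
}, index=dates)

corr = return_correlations(prices, 2017)
print(corr.round(2))
```

Note that A and B are perfectly correlated even though their price levels differ by a factor of ten, which is the point of working with returns instead of raw prices.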
These correlation coefficients are all over the place. Coefficients close to 1 or -1 mean that the series are strongly correlated or inversely correlated respectively, and coefficients close to zero mean that the values are not correlated, and fluctuate independently of each other.
To help visualize these results, we’ll create one more helper visualization function.
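A simplified sketch of such a helper, assuming it receives an already-computed correlation matrix. The name correlation_heatmap and its parameters are hypothetical. As a simplification, it returns a plain figure dictionary (which plotly.offline.iplot accepts) and leaves the color scale at Plotly's default. The coefficients in the demonstration are made up.

```python
import pandas as pd


def correlation_heatmap(corr, title, absolute_bounds=True):
    """Build a Plotly-style heatmap figure dict from a correlation matrix."""
    heatmap = {
        "type": "heatmap",
        "z": corr.values.tolist(),
        "x": list(corr.columns),
        "y": list(corr.index),
    }
    if absolute_bounds:
        # Pin the color range to the full [-1, 1] span so that charts for
        # different periods are directly comparable.
        heatmap["zmin"] = -1.0
        heatmap["zmax"] = 1.0
    return {"data": [heatmap], "layout": {"title": title}}


# Made-up coefficients for two coins.
corr = pd.DataFrame([[1.0, 0.5], [0.5, 1.0]], index=["BTC", "ETH"], columns=["BTC", "ETH"])
fig = correlation_heatmap(corr, "Cryptocurrency Correlations")
```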
Here, the dark red values represent strong correlations (note that each currency is, obviously, strongly correlated with itself), and the dark blue values represent strong inverse correlations. All of the light blue/orange/gray/tan colors in-between represent varying degrees of weak/non-existent correlations.
What does this chart tell us? Essentially, it shows that there was little statistically significant linkage between how the prices of different cryptocurrencies fluctuated during 2018.
Now, to test our hypothesis that the cryptocurrencies have become more correlated in recent months, let’s repeat the same test using only the data from 2018.
These are somewhat more significant correlation coefficients. Strong enough to use as the sole basis for an investment? Certainly not.
It is notable, however, that almost all of the cryptocurrencies have become more correlated with each other across the board.
Huh. That’s rather interesting.
Why is this happening?
Good question. I’m truly not sure.
The most immediate explanation that comes to mind is that hedge funds have recently begun publicly trading in crypto-currency markets [2]. These funds have vastly more capital to play with than the average trader, so if a fund is hedging their bets across multiple cryptocurrencies, and using similar trading strategies for each based on independent variables (say, the stock market), it could make sense that this trend of increasing correlations would emerge.
In-Depth – XRP and STR
For example, one noticeable trait of the above chart is that XRP (the token for Ripple) is the least correlated cryptocurrency. The notable exception here is with STR (the token for Stellar, officially known as "Lumens"), which has a stronger (0.62) correlation with XRP.
What is interesting here is that Stellar and Ripple are both fairly similar fintech platforms aimed at reducing the friction of international money transfers between banks.
It is conceivable that some big-money players and hedge funds might be using similar trading strategies for their investments in Stellar and Ripple, due to the similarity of the blockchain services that use each token. This could explain why XRP is so much more heavily correlated with STR than with the other cryptocurrencies.
Quick Plug – I’m a contributor to Chipper, a (very) early-stage startup using Stellar with the aim of disrupting micro-remittances in Africa.
This explanation is, however, largely speculative. Maybe you can do better. With the foundation we’ve made here, there are hundreds of different paths to take to continue searching for stories within the data.
Here are some ideas:
- Add data from more cryptocurrencies to the analysis.
- Adjust the time frame and granularity of the correlation analysis, for a more fine or coarse grained view of the trends.
- Search for trends in trading volume and/or blockchain mining data sets. The buy/sell volume ratios are likely more relevant than the raw price data if you want to predict future price fluctuations.
- Add pricing data on stocks, commodities, and fiat currencies to determine which of them correlate with cryptocurrencies (but please remember the old adage that "Correlation does not imply causation").
- Quantify the amount of "buzz" surrounding specific cryptocurrencies using Event Registry, GDELT, and Google Trends.
- Train a predictive machine learning model on the data to predict tomorrow’s prices. If you’re more ambitious, you could even try doing this with a recurrent neural network (RNN).
- Use your analysis to create an automated "Trading Bot" on a trading site such as Poloniex or Coinbase, using their respective trading APIs. Be careful: a poorly optimized trading bot is an easy way to lose your money quickly.
- Share your findings! The best part of Bitcoin, and of cryptocurrencies in general, is that their decentralized nature makes them more free and democratic than virtually any other asset. Open source your analysis, participate in the community, maybe write a blog post about it.
An HTML version of the Python notebook is available here.
Hopefully, now you have the skills to do your own analysis and to think critically about any speculative cryptocurrency articles you might read in the future, especially those written without any data to back up the provided predictions.
Thanks for reading, and please comment below if you have any ideas, suggestions, or criticisms regarding this tutorial. If you find problems with the code, you can also feel free to open an issue in the Github repository here.
I’ve got a second (and potentially third) part in the works, which will likely be following through on some of the ideas listed above, so stay tuned for more in the coming weeks.
Full-stack engineer, data enthusiast, insatiable learner, obsessive builder. You can find me wandering on a mountain trail, pretending not to be lost. Software Engineer @ Google NYC.