<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Shanlong Ding</title>
<link>https://shanlong-who.github.io/DSIR-blog/</link>
<atom:link href="https://shanlong-who.github.io/DSIR-blog/index.xml" rel="self" type="application/rss+xml"/>
<description>Notes on global-health data science, R, and WHO data workflows</description>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Mon, 01 Jun 2026 16:00:00 GMT</lastBuildDate>
<item>
  <title>One tidy workflow for WHO GHO and UN SDG data in R: introducing DSIR</title>
  <dc:creator>Shanlong Ding</dc:creator>
  <link>https://shanlong-who.github.io/DSIR-blog/posts/dsir-intro/</link>
  <description><![CDATA[ 





<p>If you have ever pulled indicator data from both the <strong>WHO Global Health Observatory (GHO)</strong> and the <strong>UN SDG database</strong> in the same project, you know the small frictions add up. The two APIs speak different dialects: GHO keys countries by ISO3 codes, the SDG API by UN M49 numeric codes. Their responses come back with different column names and different shapes. So before you can do any actual analysis, you end up writing — and rewriting — the same glue code to align them.</p>
<p><a href="https://shanlong-who.github.io/DSIR/"><strong>DSIR</strong></a> (“Data Science Infrastructure for Global Health”) is a small R package I wrote to take that friction away. It bundles country metadata, lightweight clients for the GHO and SDG APIs that return a <em>single shared schema</em>, and a set of WHO-style <code>ggplot2</code> and <code>flextable</code> themes so the output already looks like something you can drop into a report.</p>
<p>It is on CRAN:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"DSIR"</span>)</span></code></pre></div></div>
<section id="country-metadata-you-dont-have-to-maintain" class="level2">
<h2 class="anchored" data-anchor-id="country-metadata-you-dont-have-to-maintain">Country metadata you don’t have to maintain</h2>
<p>A surprising amount of global-health code starts by hand-maintaining a list of countries, regions, and ISO codes. DSIR ships that as data. <code>who_countries</code> is a tibble of all 194 WHO Member States, with ISO3 / ISO2 / UN M49 codes, official and short names, WHO region, and a flag for Pacific Island Countries:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(DSIR)</span>
<span id="cb2-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(dplyr)</span>
<span id="cb2-3"></span>
<span id="cb2-4">who_countries <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb2-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(who_region <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"WPR"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb2-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">select</span>(iso3, name_short, is_pic)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 28 × 3
   iso3  name_short        is_pic
   &lt;chr&gt; &lt;chr&gt;             &lt;lgl&gt; 
 1 AUS   Australia         FALSE 
 2 BRN   Brunei Darussalam FALSE 
 3 KHM   Cambodia          FALSE 
 4 CHN   China             FALSE 
 5 COK   Cook Islands      TRUE  
 6 FJI   Fiji              TRUE  
 7 IDN   Indonesia         FALSE 
 8 JPN   Japan             FALSE 
 9 KIR   Kiribati          TRUE  
10 LAO   Lao PDR           FALSE 
# ℹ 18 more rows</code></pre>
</div>
</div>
<p>There are also ready-made ISO3 vectors for each region — <code>wpro_cty</code>, <code>afro_cty</code>, <code>euro_cty</code>, and so on — that you can pass straight into the data functions. No more pasting country lists between scripts.</p>
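<p>If you need a grouping that is not one of the bundled vectors, the same columns make it easy to build your own. The sketch below uses a small hand-typed stand-in for <code>who_countries</code> (only the columns used above, with a few rows copied from the printed output) so that it runs without the package. <code>toy_countries</code> and <code>pic_cty</code> are names made up for this illustration, not DSIR objects.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical stand-in for who_countries: a few WPR rows from the output above
toy_countries &lt;- data.frame(
  iso3       = c("AUS", "COK", "FJI", "JPN", "KIR"),
  name_short = c("Australia", "Cook Islands", "Fiji", "Japan", "Kiribati"),
  who_region = "WPR",
  is_pic     = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Build an ISO3 vector of Pacific Island Countries from the logical flag
pic_cty &lt;- toy_countries$iso3[toy_countries$is_pic]
print(pic_cty)</code></pre></div>
<p>With the real tibble, replacing <code>toy_countries</code> with <code>who_countries</code> should give a vector you can reuse wherever a list of ISO3 codes is expected.</p>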
</section>
<section id="a-consistent-search-fetch-clean-rhythm" class="level2">
<h2 class="anchored" data-anchor-id="a-consistent-search-fetch-clean-rhythm">A consistent “search → fetch → clean” rhythm</h2>
<p>Both API clients follow the same three-step rhythm, so once you’ve learned one you’ve learned both.</p>
<p>For GHO, you search for an indicator, optionally check coverage before committing to a download, then fetch and tidy. The example below shows the search and fetch-and-tidy steps:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Search indicators by keyword</span></span>
<span id="cb4-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gho_indicators</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mortality"</span>)</span>
<span id="cb4-3"></span>
<span id="cb4-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Fetch premature NCD mortality for the Western Pacific, then tidy</span></span>
<span id="cb4-5">raw <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gho_data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"NCDMORT3070"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">spatial_type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"country"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">area =</span> wpro_cty)</span>
<span id="cb4-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gho_clean</span>(raw)</span></code></pre></div></div>
</div>
<p>The SDG client follows the same pattern. Crucially, its <code>area</code> argument also accepts ISO3 codes and converts them to M49 internally, so the same regional vectors work unchanged:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 3.4.1 = NCD mortality; pass ISO3 codes directly</span></span>
<span id="cb5-2">raw <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sdg_data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"3.4.1"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">area =</span> wpro_cty)</span>
<span id="cb5-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sdg_clean</span>(raw)</span></code></pre></div></div>
</div>
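<p>To make the ISO3-to-M49 step concrete, this is roughly the kind of lookup you would otherwise carry around in your own scripts. It is a minimal base R sketch of the idea, not DSIR’s internal code: <code>m49_lookup</code> and <code>iso3_to_m49()</code> are hypothetical names, and the table holds only three countries.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical lookup table: ISO3 codes and their UN M49 numeric codes
m49_lookup &lt;- data.frame(
  iso3 = c("AUS", "FJI", "JPN"),
  m49  = c(36L, 242L, 392L),
  stringsAsFactors = FALSE
)

# Translate ISO3 codes to M49, warning about any code missing from the table
iso3_to_m49 &lt;- function(iso3, table = m49_lookup) {
  idx &lt;- match(iso3, table$iso3)
  if (anyNA(idx)) {
    warning("Unknown ISO3 code(s): ", paste(iso3[is.na(idx)], collapse = ", "))
  }
  table$m49[idx]
}

iso3_to_m49(c("JPN", "AUS"))</code></pre></div>
<p>Unknown codes come back as <code>NA</code> with a warning rather than being dropped silently, which makes a mistyped country code easier to spot.</p>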
</section>
<section id="the-payoff-one-schema-so-you-can-just-stack-them" class="level2">
<h2 class="anchored" data-anchor-id="the-payoff-one-schema-so-you-can-just-stack-them">The payoff: one schema, so you can just stack them</h2>
<p>Here is the part that saves the most time. <code>gho_clean()</code> and <code>sdg_clean()</code> both return the <strong>same 15-column schema</strong> (<code>source</code>, <code>id</code>, <code>indicator</code>, <code>location</code>, <code>iso3</code>, <code>year</code>, <code>value</code>, <code>value_num</code>, …). Because the shape is identical, GHO and SDG output combine directly — no manual renaming, no reconciling code systems:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">gho <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gho_data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"NCDMORT3070"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">area =</span> wpro_cty) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gho_clean</span>()</span>
<span id="cb6-2">sdg <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sdg_data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"3.4.1"</span>,        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">area =</span> wpro_cty) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sdg_clean</span>()</span>
<span id="cb6-3"></span>
<span id="cb6-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_indicators</span>(gho, sdg)   <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># keep track of origin via the `source` column</span></span></code></pre></div></div>
</div>
<p>That <code>source</code> column means you never lose track of where a row came from, even after you’ve stacked half a dozen indicators from both APIs.</p>
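<p>Why a shared schema matters is easiest to see with toy data. The sketch below is plain base R rather than <code>bind_indicators()</code> itself, and it makes several assumptions: it uses only five of the columns named above, the <code>"GHO"</code> and <code>"SDG"</code> labels are assumed values for <code>source</code>, and the numbers are placeholders, not real estimates.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Toy frames with identical columns; value_num holds placeholder numbers
gho_toy &lt;- data.frame(
  source = "GHO", indicator = "NCDMORT3070",
  iso3 = c("AUS", "FJI"), year = 2019L, value_num = c(1, 2),
  stringsAsFactors = FALSE
)
sdg_toy &lt;- data.frame(
  source = "SDG", indicator = "3.4.1",
  iso3 = c("AUS", "FJI"), year = 2019L, value_num = c(3, 4),
  stringsAsFactors = FALSE
)

# Identical column names are what make stacking safe
stopifnot(identical(names(gho_toy), names(sdg_toy)))

# Stack the rows, then count them by origin
stacked &lt;- rbind(gho_toy, sdg_toy)
table(stacked$source)</code></pre></div>
<p>If the column names differed, the <code>stopifnot()</code> check would fail before any rows were combined, which is the situation the shared schema avoids.</p>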
</section>
<section id="charts-and-tables-that-already-look-like-who" class="level2">
<h2 class="anchored" data-anchor-id="charts-and-tables-that-already-look-like-who">Charts and tables that already look like WHO</h2>
<p>The other recurring cost in this kind of work is making every chart and table look consistent across a report. DSIR includes publication-ready themes so you don’t restyle from scratch each time. Here <code>theme_dsi()</code> and <code>scale_y_dsi_col()</code> (which removes the gap between bars and the axis) do the work:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb7-2"></span>
<span id="cb7-3">who_countries <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb7-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">count</span>(who_region) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb7-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">reorder</span>(who_region, n), n)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_col</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#0093D5"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_flip</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_dsi_col</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_dsi</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"WHO Member States by region"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://shanlong-who.github.io/DSIR-blog/posts/dsir-intro/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
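<p>For readers curious about what “removing the gap” involves, the general ggplot2 technique is to set the lower axis expansion to zero. The sketch below uses only plain ggplot2 and illustrates the idea rather than DSIR’s actual code. The 5% headroom above the bars is an assumed value, and <code>toy_counts</code> holds placeholder numbers.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Toy counts; the region labels and numbers are placeholders
toy_counts &lt;- data.frame(region = c("A", "B", "C"), n = c(3, 5, 2))

# No expansion below the bars, a small (assumed) 5% headroom above them
bar_expand &lt;- expansion(mult = c(0, 0.05))

p &lt;- ggplot(toy_counts, aes(reorder(region, n), n)) +
  geom_col(fill = "#0093D5") +
  coord_flip() +
  scale_y_continuous(expand = bar_expand) +
  labs(x = NULL, y = NULL)

print(p)</code></pre></div>
<p>Because the bars start exactly at the axis line, the baseline reads as zero, which suits counts like these.</p>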
<p>For multi-panel layouts there’s <code>theme_dsi_facet()</code>, and for tables <code>dsi_flextable_defaults()</code> sets booktabs styling, bold headers and sensible padding in one line. The point is that a chart pulled straight from <code>who_countries</code> or a cleaned indicator already carries a consistent WHO look, without per-plot fiddling.</p>
</section>
<section id="try-it" class="level2">
<h2 class="anchored" data-anchor-id="try-it">Try it</h2>
<p>DSIR is on CRAN, and the documentation and source are linked below:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"DSIR"</span>)</span></code></pre></div></div>
<ul>
<li>Documentation and a fuller walkthrough: <a href="https://shanlong-who.github.io/DSIR/" class="uri">https://shanlong-who.github.io/DSIR/</a></li>
<li>Source, issues, and feature requests: <a href="https://github.com/shanlong-who/DSIR/" class="uri">https://github.com/shanlong-who/DSIR/</a></li>
</ul>
<p>It’s early days and the package is small by design — it does the unglamorous data-plumbing so the analysis can start sooner. If you work with WHO or SDG data in R and find it useful, I’d love to hear what’s missing; issues and suggestions are very welcome. And if it saves you some glue code, a GitHub star helps other people working on global health find it too.</p>


</section>

 ]]></description>
  <category>R</category>
  <category>global health</category>
  <category>WHO</category>
  <category>SDG</category>
  <category>ggplot2</category>
  <guid>https://shanlong-who.github.io/DSIR-blog/posts/dsir-intro/</guid>
  <pubDate>Mon, 01 Jun 2026 16:00:00 GMT</pubDate>
  <media:content url="https://shanlong-who.github.io/DSIR-blog/posts/dsir-intro/index.jpg" medium="image" type="image/jpeg"/>
</item>
</channel>
</rss>
