Today I am processing data from a patient level costing system (PLiCS) and trying to assess the cost distribution by the patients diagnosis.

Frequency distribution charts can be a bit painful to produce in excel so I have fallen back on my favourite tool – the pivot table which enables me to produce dynamic histogram charts within minutes.

You start with your dataset showing as a minimum, the unique patient episode identifiers, diagnosis codes and the cost per episode. Pivot this table with cost in the row area (not data) and then the unique patient episode identifiers in the data area. As I’m interested in how many epsiodes fell into each cost bin I have used the COUNT function on cds_unique_identifier.

Creating Histogram or Frequency Distribution Charts with Pivot Tables

Now to get the cost buckets or bins, you right click in any of the rows showing Total_Cost and select group. The values in the dialog box are pre-populated but you can overwrite and I have chosen to set my intervals every £30.

Creating Histogram or Frequency Distribution Charts with Pivot Tables 2

This provides the summarised output table which can then be used to drive a pivot chart – just click in the pivot table then go to the INSERT tab and select the line chart of your choice.

Creating Histogram or Frequency Distribution Charts with Pivot Tables 3

Hey presto – a distribution chart that can be altered dynamically by using the filter to bring in more or less diagnoses.

Creating Histogram or Frequency Distribution Charts with Pivot Tables 4

Now I just need to fathom out what to do with the information.

{ 3 comments }

Since we’ve moved to excel 2010 with its expanded data range the tendency has been to create ever more detailed reports. Where once we summarised data we can now leave in every last bit of detail.

Detailed reports always need summaries which is where the pivot table comes in handy. So now we have huge data sheets and a load of number crunching pivot tables to make sense of it all. It is not a huge surprise that our spreadsheets have ballooned in size.

At month end after processing the SLAM data for multiple trusts I can easily end up with spreadsheets in excess of 40Mb which is a big problem, not only do these take ages to process and take up more than their fair share of storage space but I struggle to distribute them as they won’t go through the email system.

That’s a big problem and we need a solution which will cut the size of our pivot table spreadsheets.

Pivot Cache

It helps to understand how the problem of large file sizes occurs when using pivot tables

Whenever you create a PivotTable, Microsoft Excel creates a duplicate, hidden copy of the source data. This is known as the pivot cache and enables Excel to quickly calculate and aggregate data without interfering with the original data. It optimizes processing time but it also increases the size of the spreadsheet as it effectively doubles up the amount of data stored with each spreadsheet. If you create multiple pivot tables from the same data source you can end up doubling, tripling, quadrupling etc etc according to the number of additional caches produced.

There are basically two different options for dealing with the pivot cache to reduce the size of pivot table reports, you can either force Excel not to create the Pivot Cache or use the Pivot Cache but optimize by using the same cache for each pivot and removing the original data sheet to avoid duplication.

Store Excel Pivot Table without the Pivot Cache

You can prevent Excel from creating an additional cache of data at the pivot table wizard stage or after you’ve created the pivot table by going into the pivot table options screen and deselecting the checkbox “Save source data with file”.

Save source data with file
This doesn’t sever the link with the data source so you can still refresh and reorganise your pivot but it does increase the time taken to refresh.

This method does decrease the size of the file but as you can see from the conclusion below that it is not the most efficient way of reducing file sizes.

Store Excel Pivot Table with the Pivot Cache but optimise by:

1. Ensure all pivots use same cache

In older versions of excel it was very easy to build multiple pivot tables from the same range of data but with each pivot creating it’s own cache and therefore expanding the file size. In Excel 2010 this seems to have been improved and it is much harder to build multiple cache’s. It is still worth knowing how to avoid this and force the pivot to use the same cache of data.

When I create multiple pivot tables I always copy and paste from another pivot table rather recreating from scratch. This forces it to use the same cache and ensures that any calculated fields, grouping etc are available across all pivots. This way each additional pivot is practically free in terms of additional file size.

2. Delete the data sheet

DATA TABIf you are allowing the pivot table to store the cache there is no reason to also show a data tab in your workbook as you now have two copies of the same thing.

If you delete the data tab only the cache will be saved and you will save the space associated with the original data table. It is particularly efficient as the cache does not store calculations or formatting but you can no longer refresh the table.

How to retrieve data from pivot

If you did need to retrieve the driving data and display on a single sheet just double click on the Grand Total to reveal all data.

Best of all,

3. Get Data from an External Source

By basing your pivot table on an external source (which may just be another spreadsheet) you have the best of both worlds – the pivot table can be refreshed as the external data table is amended but you also have a reduced file size as only the pivot cache is stored in memory.

Conclusion and file sizes

File Sizes with reduced pivotsThe image to the right shows the impact of these various options on the file size. As you can see the original data source is a whopping 52MB. This is pure data, it has minimal formatting and no calculations.

Saving a pivot with its associated cache increases this file size to almost 75MB. (Interestingly it hasn't doubled, which I assume relates to the cache being pure data without formatting but i could be wrong).

By removing the check box “Save source data with file” the file goes back down to the original 52MB so the pivot table itself carries very little extra burden.

The best of all options has been achieved by basing the pivot table on an external spreadsheet – in this case the data only spreadsheet, so that only the cache and pivot are stored. In this instance the file has reduced to more than half that of the original data file.

So my recommendation is to save the data sheet as a workbook and then place all associated pivot tables in a separate workbook – you can allow the pivot table to save the cache (the default position) if you want the benefit of rapid refreshing and plan to have multiple pivot tables with calculations and groupings.

File Sizes extremeFor extreme file size reduction you can combine all of the above techniques, basing the pivot table on the 52MB external data source and then removing the pivot cache reduces the file size to a tiny 17KB – notice K and not M.

{ 1 comment }

There is a division in the office between the have's and the have nots. Those that have taken the brave leap into the unknown and upgraded to Excel 2007 (or even Excel 2010) and those that are resolutely sticking to the old stalwart, Excel 2003.

It's a nuisance. Perfectly decent spreadsheets are incapacitated when you either save as an Excel 97-2003 workbook or open them in the earlier version. Pivot tables are ruined, formatting goes to pot and any spreadsheet protection you've carefully thought out is ignored.

It's Agreement of Balances time in the NHS at the moment, which means multiple accountants are sharing and working on the same excel spreadsheets. Compatibility issues within the office have caused problems and when you think that the output  needs to be shared with other NHS organisations with their own versions of Excel it becomes vital to understand  how to create a pivot table in Excel 2007 that someone can use in Excel 2003.

Saving a spreadsheet as Excel 97-2003 will not save your pivot tables

If you want your pivot table to work when somebody opens your spreadsheet in an earlier version of Excel you will need to create it in an earlier version of Excel, or at least you will have to force Excel 2007 or 2010 to work in compatibility mode. It is not sufficient to simply save your Excel workbook as an earlier version.

If you open Excel 2007 (or 2010) and create a new spreadsheet you can choose file, save-as, and create an Excel 97-2003 workbook version. If you then create a pivot table in this workbook and re-save you will be rewarded with a compatibility error message, informing you that your carefully constructed pivot table will be useless if you persist on saving it in the antiquated version you've selected.

If you search around the help files or internet for a solution you'll find that you need to create the old version while in compatibility mode. It is not altogether clear how to go out about creating a new spreadsheet in compatibility mode.

How Do You Force Microsoft Excel 2007 or 2010 to Work in Compatibility Mode?

Recognising when Excel 2007 has opened in compatibility mode

If you open an old excel spreadsheet in either Excel 2007 or Excel 2010 the new version will automatically default to compatibility mode.

Compatibility mode is effectively Excel 2003 in disguise and once you've got your version of Excel in this state you can create pivot tables safe in the knowledge that they will work in the earlier version.

You can recognise compatibility mode by the [square brackets] that appear after your file name in the workbook header. If you don't see this, you aren't in compatibility mode so don't bother creating your pivot table yet.

So to open Excel 2007 in compatibility mode you need to open an earlier version of an excel spreadsheet. If you don't have one to hand, do the following:

  1. File – New – Create Blank Workbook
  2. File – Save As – Excel 97-2003 Workbook. I suggest calling it "blank 2003 workbook".
  3. Although you now have a blank Excel 97-2003 workbook you are not yet in compatibility mode.
  4. Close the new workbook
  5. Re-open the blank Excel 97-2003 workbook. Tiresome I know.
  6. You have now forced Excel 2007 to open in compatibility mode!

You can either keep the blank Excel 97-2003 workbook handy for future use or you could save it as a 97-2003 template so that you can easily create new workbooks in compatibility mode.

{ 1 comment }

The NHS Reference or NHSRef app gave me such joy but it is so niche that I almost missed it.

The The NHS Reference iPhone app has been put together by the NHS Information Centre and is the perfect reference source for NHS professionals from clinicians, clinical coders, informatics analysts and finance staff. NHSRef functions as a user friendly version of the mammoth Code to Grouper spreadsheet that is used to derive payment HRGs from OPCS procedure codes and ICD diagnostic codes but it also appears to include the whole of the PbR Tariff Information spreadsheet.

So to review the NHSRef app, it includes:

Clinical Coding details: ICD-10, OPCS-4.6, A&E

Payment by Results (2011/12): HRG4, Exclusions, Top-Ups, A&E bands

Data Dictionary – Specialty, APC, OPA, A&E

Organisational Data: NHS Providers including MFF, top-eligibility and addresses.

I've found NHSRef to be an invaluable resource, using it to quickly access tariff information for HRGs and outpatient appointments by specialty.

If you want to track through clinical coding of patients from diagnostic and/or procedure codes it is much easier to use this app than open up the clunky code to grouper spreadsheet. I'm not in anyway a coding expert but here's a beginners guide to costing "simple" patient spells.

Beginners Guide to Costing Patient Spells using NHSRef iPhone App

Here's a simple example to demonstrate how you can use the NHSRef app to help you cost up a patient spell.

You need to start with the patients case notes so you can determine the diagnosis and procedures (if any).

I've searched for the relevant coding and entered the ICD-10 and OPCS codes above. I've also shown the procedure hierarchy which is clearly shown in the NHSRef OPCS tables. The hierarchy is important for grouping to spell HRGs. Both the ICD-10 and OPCS codes will group to a HRG but you need to determine the dominant one. The code to grouper user manual includes the following useful flow chart:

So in this case the total excision of a kidney has a hierarchy of 10 which is greater than 3 and so will drive the choice of HRG. Here are the relevant screens from NHS Ref, all accessed initially from the the OPCS screen.

You can now complete the coding document to determine the income generated by the provider trust.

Reference NHS - TheJcs

{ 1 comment }

I use pivot tables a lot. They drive multiple reports in my workbooks and it is not unusual for me to have 6 or so pivots in each spreadsheet. When the underlying data changes I spend quite a lot of time clicking through each pivot table to refresh it.

Refresh All Button to refresh all pivots at once

I was happy to discover the Refresh All button in excel 2007 today. Press it and all your pivots will refresh as if by magic!

You’ll find it on the Data ribbon.

Refreshing All Pivots using Macro Code

I also have a number of macro enabled workbooks that generate reports and update pivot tables. I have been writing code that specifies the pivot table by name and then refreshes it but it can be tricky to find the pivot table name. You can use refresh all within the code to refresh all pivot tables at once.

Try:

ActiveWorkbook.RefreshAll

or

Workbooks(1).RefreshAll

Originally published on my excelpivots blog.

{ 0 comments }

Macro to Insert a "Last Update On:" Date Value

December 19, 2011
Macro to Insert a "Last Update On:" Date Value

I have a number of summary reports that are updated by the use of a macro update button. I usually include an area on the report to include a datestamp that indicates when the report was last updated or refreshed. This is the code I use: Dim VERSIONDATE As String VERSIONDATE = Format(Now, "dd-mm-yy hh-mm-ss") [...]

Read the full article →

Sumproduct Function and Conditional Criteria

December 7, 2011
Sumproduct Function and Conditional Criteria

The sumproduct function is one of the most useful and powerful features of excel but the online help available from microsoft is unfortunately pretty poor. It works as an advanced form of the SUMIF formula where you can sum a range of values based on a number of different criteria. Here is the simple formula [...]

Read the full article →

Macro to Consolidate Data Ranges from Multiple Excel Spreadsheets

December 4, 2011
Macro to Consolidate Data Ranges from Multiple Excel Spreadsheets

I have a lot of macro workbooks that consolidate multiple ranges from individual tabs in the same workbook into one summary tab but today I discovered how to consolidate ranges from different excel workbooks into one summary workbook. My task today was to consolidate one of the worst NHS SLAM reports that we receive. This [...]

Read the full article →

The SUMIF Function using Named Ranges

December 2, 2011
The SUMIF Function using Named Ranges

Named ranges are useful in Excel as their use can make formulas much more understandable. You can name: areas of the spreadsheet, such as your data columns and rows or individual cells that may act as constants, for example the VAT value I’ve used the following named ranges in the SUMIF example below: As usual [...]

Read the full article →