Step-By-Step Guide
Task 2.2: Prepare for Data Analysis
You have been asked to perform three tasks: create a line listing, create an epidemiological curve (“epi curve” for short), and provide a Descriptive Epidemiology Report (DER). These analyses will give you a sense of what you can learn from the initial data, and help you to determine the next steps in the investigation.
The steps below will walk you through this process.
- Review the email from Dr. Lyons to confirm your understanding of the task.
- Download and review the documents attached to Dr. Lyons’ email (Questionnaire v1, EpiCS Alert) both to see what is there and to get a feel for what resources you have available to you in this task.
- Each team member should download a copy of Questionnaire v1 and keep his/her own notes. Each of you may notice different things, which will be helpful to share with each other and discuss.
Identify Variables for the Line Listing
The line listing is a tool that can help the investigation team summarize what is known so far about the case-patients and also uncover specific clues as to the source of the outbreak. Even with only 15 case-patients, a 17-item questionnaire makes it difficult to scan the results and draw accurate conclusions about the data you have collected. The line listing is a table that summarizes some of the key data points, called “variables,” that have been collected with the lengthier questionnaire and from other sources.
- As you go through this process, be sure to utilize Developing a Line Listing and other information available in the Resources link (above).
- Review Questionnaire v1 to determine the key questions in each section.
- A question could be categorized as key if it is related to hypotheses about the source of the outbreak or would help you to identify a ‘hot spot,’ or area of possible exposure. One example of a hot spot would be a petting zoo: you might want to know if all the case-patients had been to a petting zoo within the 6 months prior to the appearance of symptoms. This would identify a pattern for you and might prompt you to want to find out more details about those visits in the future. In this case, you might select “Visit to petting zoo Y/N” as a possible variable.
- Some of the questions in Questionnaire v1 may be related to hypotheses that your team had not considered in task 2.1. You may need to consider additional hypotheses when developing the line listing.
- Open-ended questions in a questionnaire can be summarized easily if distilled to a binary form. For example, an open-ended menu question in a foodborne investigation can be distilled to “Chicken Y/N.”
- Alternatively, if the need arises you can expand a single question within a questionnaire into multiple columns. One place this might happen is with open-ended questions that can have multiple answers.
In this case, you would still want to retain the binary column to facilitate the ability to sort by Yes or No, but also include a column for the open-ended response. For example, with travel to a petting zoo, you could have the binary Y/N column, and then also a column to indicate which zoo they visited.
- Once you identify a particular question to be included, turn that question into the form of variable that is easiest to analyze. The goal in creating the variable is to make analyzing the data as easy and as informative as possible.
- Use a binary (yes/no) form of the variable whenever possible. For example, if the question is “Travel history in the past 6 months,” the binary form could be “Travel to endemic area Y/N.”
- However, sometimes a binary variable does not capture all of the information you might eventually need (such as when the travel to the endemic area occurred), so you might either need to create multiple variables to capture all of the relevant information, or you might choose a different form of the variable.
- For example, where the questionnaire asks the respondent about the patient’s exposure to soil or dust, you might include categories for sources of exposure other than gardening so that they are counted and reflected in your line listing.
- In some instances a column with the binary yes/no information may be followed by a column(s) where the specific answer(s) can be listed when the answer is “Yes”. For example, the column for the variable “Travel to endemic area Y/N” can be followed by a column “Endemic area travel location.”
Tip: By having a binary column AND an open-ended column instead of a single column, the line listing is easier to analyze. The binary column can be used to sort the line listing if needed, while the open-ended column lets people view the response details without needing to check individual questionnaires.
- For open-ended questions that are not meaningful when condensed into yes/no responses, even with follow-up as described above, a column for the open question can be included. Note that it is still important to condense the response as much as possible to make reviewing the line listing as easy as possible.
- When your team has drafted the variables for the line listing, set it aside in preparation for review with your mentor.
- Please note: It is not necessary to transcribe or fill in the data from the 15 case-patient profiles once you have identified the variables. This will be done once the line listing is finalized.
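If your team keeps the line listing in a spreadsheet or script, the binary-plus-detail approach described above might look something like the following minimal sketch. Everything in it is an assumption made for illustration: the responses, the region names, and the helper `to_line_listing_row` are hypothetical and do not come from Questionnaire v1 or the case-patient profiles.

```python
# Hypothetical open-ended questionnaire answers (illustration only;
# not taken from the case-patient profiles).
responses = [
    {"case_id": 1, "travel_history": "Visited Region A in March"},
    {"case_id": 2, "travel_history": ""},
    {"case_id": 3, "travel_history": "Region B, two weeks"},
]

# Assumed list of endemic areas, invented for this example.
ENDEMIC_AREAS = ["Region A", "Region B"]


def to_line_listing_row(response):
    """Distill one open-ended answer into a binary column plus a detail column."""
    text = response["travel_history"].lower()
    visited = [area for area in ENDEMIC_AREAS if area.lower() in text]
    return {
        "case_id": response["case_id"],
        "travel_to_endemic_area": "Y" if visited else "N",
        "endemic_area_travel_location": "; ".join(visited),
    }


line_listing = [to_line_listing_row(r) for r in responses]

# The binary column makes sorting easy: all "Y" rows come first.
line_listing.sort(key=lambda row: row["travel_to_endemic_area"] != "Y")

for row in line_listing:
    print(row)
```

The binary column drives the sort, while the detail column keeps the specifics visible without going back to the individual questionnaires.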
Create an Epi Curve
The epi curve shows, in a simple visual format, the evolution of the outbreak: how many known cases there are and when they appeared. This is an example of a histogram. The shape of the curve can suggest the pattern of spread for an outbreak. Use Microsoft Word or Excel to create your epi curve.
- View the example Epi Curves and other information in the Resources link (above).
- The x-axis in this chart measures timing and is often the date of onset of illness among cases, and the y-axis is the number of cases.
- Cases (represented by boxes) are placed along the x-axis by a specific time scale, in this case by month.
- Start by labeling your epi curve with an appropriate title. An epi curve is incomplete without a title.
- Using the data provided to you in the 14 New Case-Patient Profiles in Task 2.1, choose the length of the time intervals to be used for your x-axis. Discuss with your team what period of time would be meaningful to you in this investigation. Fill in this information along the x-axis of the table.
- Consider the incubation period of the pathogen in your investigation, if it is known.
- If the pathogen has an incubation period of about 30 days, then a time scale measured in hours is going to give you many blank time slices that don’t tell you very much.
- For the same pathogen, a time scale measured in years will lump all of your cases together into the same time slice, so you won’t be able to see if those cases developed at different times.
- A general rule is to choose your time scale to be one unit “smaller” than the incubation period of the pathogen. If the incubation period is measured in weeks, then scale your epi curve in days; if the incubation period is measured in days, then scale your epi curve in hours.
- Label the x-axis. A general rule of thumb is to start the x-axis one or two units before the earliest known case. As more cases are discovered, the time scale for the x-axis may need to change when you update the epi curve; if no earlier cases are discovered, the empty leading intervals show that no known cases occurred before the first case in the investigation.
- Label the y-axis with the numbers of cases to be counted during each interval. Choose the y-axis scale based on what you know of the size of the outbreak; the larger the outbreak, the larger your scale needs to be on the y-axis to accommodate all of the case patients. Unless you have an unusually large number of case patients initially, a scale of one (numbering each case-patient individually along the y-axis) is appropriate.
- Please note: The y-axis should be on the far left of the graph. Remember that the negative numbers used for the "Date" in this rotation are due to the mechanics of the rotation. In a real investigation the labeling on the x-axis would be dates starting as mentioned above and would not contain negative numbers.
- Draw a box for each known case-patient in the time interval during which that case-patient became ill.
- The scale along the y-axis will dictate how large to draw the box; if you have chosen a scale of 1 for the y-axis, for example, then the box would be drawn as tall as the “1” on the y-axis. With a larger y-axis scale, you should reduce the size of the box you draw accordingly.
- Make sure to use the same criterion to determine the time of illness onset for each case-patient; otherwise the curve may be inaccurate and misleading. Possible criteria include the time of symptom onset, when the patient first went to the doctor, etc. How might using a doctor visit as the criterion affect the curve and the investigation of the source of the outbreak when some case-patients go to the doctor at the first sign of illness and others wait until they are so ill they can barely function? The criterion should take the time interval for the disease into consideration as well. How would inaccuracies of a few days affect an epi curve with a time interval in months compared to weeks or hours?
Trap: There may not be clear solutions to the problems mentioned above. Decisions still need to be made so the investigation can move forward. Do not spend too much time trying to find the perfect solution. Choose the best criteria available and document the decision; if problems arise later then the documentation can help identify potential solutions.
- You may want to make additional distinctions (via color or pattern) for severity of illness or other criteria at your team’s discretion. For example, in a foodborne outbreak, information about severity might help you pinpoint the source. At a minimum, you might want to identify any case-patients who died, as that will be something you want to report on later in the investigation.
- When your team has drafted the epi curve, consider as a group what implications it has for your investigation. Reference the “Epi Curve” section of the Resources (above) to help you understand the basic terms and determine what the shape of the epi curve might tell you about the source of the outbreak.
- The shape of the epi curve may provide clues to the pattern of spread in the population. Those patterns might be:
- Point source (a single event)
- Intermittent source (multiple events)
- Propagated (transmitted person-to-person)
- The curve can also tell you where you are in the course of the outbreak – are you on the upswing? The down slope? This can help you predict whether more cases are anticipated in the next time interval and may help you refine your case definition.
- The curve might also be able to point out outliers, or cases that don’t fit into the body of the curve. These outliers may help you to define your case if you review their information carefully. Outliers can prompt investigators to change the timeframe of their outbreak (increase it), and also reconsider its scope.
- For example, an outbreak that looks fairly well-contained in time on the epi curve, resembling a point-source “bell curve” but with a couple of outliers on the right, may not be over; it might still be ongoing due to propagation that was not anticipated originally. An outbreak that is well-contained in one city may be propagated when a resident gets on a plane and flies to another part of the country. Or an outbreak related to contaminated peanut butter may seem over until undiscovered cases of the recalled product are distributed by accident and more people get sick.
- In some cases the outliers may not be true case-patients and became ill for reasons other than the source of the outbreak.
Tip: Do consider how using other functions in MS Word and Excel, such as borders, cell sizing, and cell shading, can create a clean, informative epi curve that is easy to modify as new information becomes available.
Trap: Do not limit ideas on how to create the epi curve in a spreadsheet based on the graph/chart functionality of the application. While the chart functions are useful in some instances, they do not serve every purpose.
- When you have finished developing your epi curve, set it aside in preparation for review with your mentor.
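For teams that prefer to prototype the chart before drawing it in Word or Excel, the steps above (count cases per interval, start the x-axis one or two units before the earliest case, stack one box per case with the y-axis on the far left) can be sketched in a few lines. This is only an illustration under stated assumptions: the onset values are made up, the function names `build_epi_curve` and `render_epi_curve` are hypothetical, and the onset list is assumed to be non-empty with onsets already converted to whole interval numbers using one consistent criterion.

```python
from collections import Counter

# Hypothetical onset intervals (for example, month numbers).
# These are invented values, not data from the case-patient profiles.
onset_intervals = [3, 4, 4, 5, 5, 5, 6, 8]


def build_epi_curve(onsets, lead_in=2):
    """Count cases per interval, starting `lead_in` units before the first case."""
    counts = Counter(onsets)
    start = min(onsets) - lead_in
    end = max(onsets)
    return [(interval, counts[interval]) for interval in range(start, end + 1)]


def render_epi_curve(curve, title):
    """Draw one box per case, stacked upward, with the y-axis on the far left."""
    tallest = max(count for _, count in curve)
    lines = [title]
    for level in range(tallest, 0, -1):
        row = "".join(" [ ]" if count >= level else "    " for _, count in curve)
        lines.append(f"{level:>2} |{row}".rstrip())
    lines.append("   +" + "----" * len(curve))
    lines.append("    " + "".join(f"{interval:>4}" for interval, _ in curve))
    return "\n".join(lines)


curve = build_epi_curve(onset_intervals)
chart = render_epi_curve(curve, "Cases by interval of illness onset (hypothetical data)")
print(chart)
```

Empty intervals are kept in the output on purpose: gaps and leading blanks are part of what the shape of the curve tells you.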
Descriptive Epi Report (DER)
A descriptive epi report (DER) provides some basic statistics about the known case-patients, typically used for reporting. For example, what percentage of the case-patients are male vs. female? What percentage died from the infection? Though these numbers do not usually indicate a possible source of the outbreak, they are still needed to communicate to others about the outbreak.
- Review the section on "Descriptive Epi Reports" in the Resources link (above) to get a feel for what information is included in a DER. Consider any summary statistics that you can cull from the Case-Patient Profiles and create those lines in your DER. Some simple examples are:
- What is the ratio of males to females among the case-patients?
- What is the distribution of the age of the case-patients (e.g., mean, median)?
- What are the states of residence of case-patients? What percent of case-patients come from each state?
- What is the range of date/time of illness onset for case-patients?
- Calculate any relevant statistics and record your answers in the DER.
- When you have finished developing your DER, set it aside in preparation for review with your mentor.
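The summary statistics listed above are simple enough to compute by hand for a small number of case-patients, but a short script can help you check your arithmetic. The sketch below uses only the Python standard library. The four records and their field names (`sex`, `age`, `state`, `onset`, `died`) are hypothetical placeholders, not values from the Case-Patient Profiles, and `percent_by` is a helper invented for this example.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical case-patient records (illustration only).
cases = [
    {"sex": "M", "age": 34, "state": "State A", "onset": 3, "died": False},
    {"sex": "F", "age": 52, "state": "State A", "onset": 4, "died": False},
    {"sex": "M", "age": 47, "state": "State B", "onset": 5, "died": True},
    {"sex": "M", "age": 29, "state": "State A", "onset": 5, "died": False},
]


def percent_by(records, field):
    """Percent of case-patients in each category of `field`."""
    counts = Counter(record[field] for record in records)
    return {value: round(100 * n / len(records), 1) for value, n in counts.items()}


ages = [case["age"] for case in cases]
onsets = [case["onset"] for case in cases]

der = {
    "n_cases": len(cases),
    "percent_by_sex": percent_by(cases, "sex"),
    "percent_by_state": percent_by(cases, "state"),
    "age_mean": mean(ages),
    "age_median": median(ages),
    "onset_range": (min(onsets), max(onsets)),
    "percent_died": round(100 * sum(case["died"] for case in cases) / len(cases), 1),
}

for name, value in der.items():
    print(f"{name}: {value}")
```

Each entry in `der` corresponds to one line you might record in the report; add or drop entries to match the statistics your team decides are relevant.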
Review and submit your work.
- Review your work.
- Did you select a manageable yet comprehensive subset of variables for your line listing?
- Did you determine the implications of the epi curve with respect to the investigation?
- Did you calculate the statistics needed for the descriptive epi?
- Submit your work.
- Review the checklist located in the Submit Your Work section of this task to ensure completion of the task before submitting your deliverables to your mentor.
- Please note: Only one set of deliverables needs to be submitted per team. Any additional notes not captured in that set of deliverables should be retained by the team members for possible use in future tasks.
Resources
Task 2.2 Resources
While Wikipedia is a valuable resource, unlike some other websites anyone can contribute to or modify the site (whether they're knowledgeable about the topic or not). As a result, the site is subject to constant change by questionable sources. Be sure to cross-check information on Wikipedia with other reputable sites to ensure accuracy.
Ask the Expert
What is a line listing and why is it important?
What is an epi curve and why is it important?
What is a case definition and why is it important?
Line Listing
Developing a Line Listing
A description of what a line listing is, how to build a line listing, and a list of variables frequently included in line listings.
Sample Line Listing 1
Table 1, about a third of the way down the page, is an example of a line listing used for a dengue outbreak.
Sample Line Listing 2
A line listing developed for an outbreak of gastroenteritis starts on page 7. Consider what type of questionnaire might be developed from the table on page 6 and how the line listing is related to that information.
Epidemiological Curve (Epi Curve)
Sample Epi Curve
An example of an epi curve with a one-month time interval. (Note: The difference between colonization and infection is whether the organism is making the person sick at this time.)
Overview of Epi Curves
From the West Virginia Department of Health and Human Resources. The document contains information on how different shapes of epi curves are related to the source of an outbreak as well as examples of epi curves.
Descriptive Epidemiology Report (DER)
The DER contains statistics regarding person, place, and time of current case-patients including the distribution of cases by age group, state of residence, or time/date of illness onset. Much of this information will be part of the Line Listing (see Developing a Line Listing (above) for a full description).
Sample DER 1: Hospital Admissions
A descriptive epidemiology report regarding fractures and hospital admissions in Australia. The descriptions, tables, and graphs that begin on page 13 are useful examples.
Sample DER 2: Lyme Disease
A descriptive epidemiology report of Lyme disease in Ontario.
Sample DER 3: E. coli
A descriptive epidemiology report of an E. coli outbreak.
Sample DER 4: Foodborne outbreak
A report on a foodborne outbreak including case definitions, an epi curve, and a descriptive epidemiology report. The DER is in the “Results” section on page 2, and in the “Appendix” on page 4.
Coccidioidomycosis (cocci)
Cocci Overview from CDC
Information from the Centers for Disease Control and Prevention (CDC) on cocci clinical features, transmission, risk groups, challenges, and other important info.
Cocci Overview 1
Includes common symptoms as well as tests and exams to detect the disease.
Cocci Overview 2
Includes x-rays of a cocci patient.
Cocci Overview 3
Includes a map of endemic areas (from Wikipedia).
Cocci Overview 4
Includes incubation periods, clinical signs, and communicability in both humans and animals from The Center for Food Security and Public Health at Iowa State University.