

Identifying Unusual Patterns that Might Indicate Data Integrity Issues

 

 

See how to:

  • Explore Patterns
  • Identify duplicate values
    • Find Most Duplicated Values - values that appear most frequently within a column
    • Find Longest Runs - values that repeat in consecutive rows within a column
    • Find Longest Duplicated Sequences - sequences of values that repeat within a column
    • Find Duplicates Across Columns - sequences of values that appear in the same rows across multiple columns
    • Use Rarity Score to interpret duplications
      • Conceptually, a pattern is about as likely as getting [rarity value] heads in a row when flipping a fair coin
      • Statistically, the rarity score is -Log2(p), where p is the probability of the pattern assuming a random ordering of values
  • Identify unusual values
    • Locate unusual Formatted Widths within cells - both overall width and number of decimals
    • Locate suspicious Fraction Lengths
    • Locate suspicious Leading Digits that are too uniform
      • Check the distribution of leading digits against Benford's Law, which says that in many naturally occurring groups of numbers, the distribution of the leading digit is not uniform
      • Under Benford's Law, the expected proportion of leading digit d is Log10((d+1)/d)
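
The two formulas in the list above (rarity score and the Benford's Law proportion) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not JMP's implementation: the function names are hypothetical, and how JMP derives the probability p for a given pattern is not described here, so p is simply taken as an input.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def rarity_score(p: float) -> float:
    """Rarity = -log2(p); p is the pattern's probability (supplied by the caller)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def benford_expected(d: int) -> float:
    """Expected proportion of leading digit d (1-9) under Benford's Law."""
    if d not in range(1, 10):
        raise ValueError("d must be an integer from 1 to 9")
    return math.log10((d + 1) / d)


def leading_digit(x: float) -> int:
    """First significant digit of a nonzero finite number."""
    # Scientific notation puts the first significant digit first.
    # 15 significant digits avoids binary-float artifacts such as 0.3 -> 2.99...
    return int(f"{abs(x):.14e}"[0])


def leading_digit_proportions(values: Iterable[float]) -> Dict[int, float]:
    """Observed proportion of each leading digit, ignoring zeros and non-finite values."""
    digits = [leading_digit(v) for v in values if v != 0 and math.isfinite(v)]
    if not digits:
        return {d: 0.0 for d in range(1, 10)}
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Ten heads in a row with a fair coin has probability 0.5**10, so rarity is 10.
print(rarity_score(0.5 ** 10))

# Benford's expected proportions, for comparison with observed proportions.
for d in range(1, 10):
    print(d, round(benford_expected(d), 4))
```

Comparing the output of `leading_digit_proportions` for a column with `benford_expected` gives a rough sense of whether the leading digits look too uniform; a formal comparison would need a goodness-of-fit test, which is left out of this sketch.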

  • Identify unexpected linear relationships where, within some group of consecutive rows (default is 10), one column has an exact linear relationship with another column
  • Identify specification limit anomalies for columns with spec limit properties
    • Locate Spec Limit Matches where values in cells exactly match the LSL or USL
    • Use Spec Limits Distribution to compare the observed out-of-spec values with the expected out-of-spec values
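
The linear-relationship check described above can be illustrated with a small Python sketch. It assumes a simple approach that is not taken from JMP: slide a window of consecutive rows (10 by default, matching the default mentioned above) over two columns, fit a line through two points in the window, and flag the window if every row falls on that line within a small tolerance. The function name and the `tol` parameter are hypothetical.

```python
from typing import List, Sequence


def exact_linear_windows(
    x: Sequence[float],
    y: Sequence[float],
    window: int = 10,
    tol: float = 1e-9,
) -> List[int]:
    """Return start indices of windows in which y is an exact linear function of x."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    hits = []
    for start in range(len(x) - window + 1):
        xs = x[start:start + window]
        ys = y[start:start + window]
        # Two points with distinct x values are needed to define a line.
        j = next((k for k in range(1, window) if xs[k] != xs[0]), None)
        if j is None:
            continue  # x is constant in this window; skip it
        slope = (ys[j] - ys[0]) / (xs[j] - xs[0])
        intercept = ys[0] - slope * xs[0]
        # Flag the window only if every row lies on the line.
        if all(abs(ys[k] - (intercept + slope * xs[k])) <= tol for k in range(window)):
            hits.append(start)
    return hits


# Made-up data: y is quadratic in x except rows 5-16, where y = 3x + 1 exactly.
x = [float(i) for i in range(30)]
y = [3 * xi + 1 if 5 <= i <= 16 else xi ** 2 for i, xi in enumerate(x)]
print(exact_linear_windows(x, y))  # → [5, 6, 7]
```

Measured data rarely lines up exactly over ten consecutive rows, so windows flagged this way are worth a closer look, for example as values that may have been computed from another column rather than measured.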

Explore Patterns.JPG

 

Benford's Law.JPG

