edac-util(1) [debian man page]

EDAC-UTIL(1)						   EDAC error reporting utility 					      EDAC-UTIL(1)

NAME
       edac-util - EDAC error reporting utility.

SYNOPSIS
       edac-util [OPTION]...

DESCRIPTION
       The edac-util program reads information from EDAC (Error Detection and
       Correction) drivers in the kernel, using files exported by these
       drivers in sysfs. With no options, edac-util will report any
       uncorrected error (UE) or corrected error (CE) information recorded by
       EDAC, along with any DIMM label information registered with EDAC.

OPTIONS
       -h, --help
              Display a summary of the command-line options.

       -q, --quiet
              Quiet mode. For some reports, edac-util will report corrected
              and uncorrected error counts for all MC, csrow, and channel
              combinations, even if the current count of errors is zero. The
              --quiet flag will suppress the display of any locations with
              zero errors, thus creating a more terse report. No output will
              be generated if there are zero total errors currently recorded
              by EDAC. Additionally, the use of --quiet will suppress all
              informational and debug messages, displaying only fatal errors.

       -v, --verbose
              Increase verbosity. Multiple -v's may be used.

       -s, --status
              Displays the current status of EDAC drivers. edac-util will
              report whether it detects that EDAC drivers are loaded, and the
              number of memory controllers (MCs) found in sysfs. In verbose
              mode, the MC id and name of each controller will also be
              printed.

       -r, --report=report,...
              Specify the report to generate. Currently, the available
              reports are default, simple, full, ue, and ce. These reports
              are detailed in the EDAC REPORTS section below. More than one
              report may be specified in a comma-separated list.

EDAC REPORTS
       default
              The default edac-util report is generated when the program is
              run without any options. If there are no errors logged by EDAC,
              this report will display "No errors to report." to stdout.
              Otherwise, error counts for each MC, csrow, channel combination
              with attributed errors are displayed, along with corresponding
              DIMM labels, if these labels have been registered in sysfs.

              The default report will also display any errors that do not
              have any DIMM information. These errors occur when errors are
              reported in the memory controller overflow register, indicating
              that more than one error occurred during a given EDAC poll
              cycle. It is usually obvious from which DIMM locations these
              errors were generated.

       simple
              The simple report reports total corrected and uncorrected
              errors for each MC detected on the system. It also displays a
              tally of total errors. With the --quiet option, only non-zero
              error counts are displayed.

       full
              The full report generates a line of output for every MC, csrow,
              channel combination found in EDAC sysfs. This includes counts
              of errors with no information ("noinfo" errors). Output is of
              the form:

                     MC:(csrow|noinfo):(label|all):(UE|CE):count

              With the --quiet option, only non-zero error counts will be
              displayed.

       ue
              This report simply displays the total number of Uncorrected
              Errors (UEs) detected on the system. With the --quiet option,
              output will be suppressed unless there are 1 or more errors to
              report.

       ce
              This report simply displays the total number of Corrected
              Errors (CEs) detected on the system. With the --quiet option,
              output will be suppressed unless there are 1 or more errors to
              report.

SEE ALSO
       edac(3), edac-ctl(8)

edac-utils-0.18-1                 2011-11-09                      EDAC-UTIL(1)
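
The one-line-per-location format of the full report, MC:(csrow|noinfo):(label|all):(UE|CE):count, described under EDAC REPORTS above, is convenient for scripts and monitoring tools. The following C sketch shows one way such a line might be split into its fields. The struct full_report_entry type, the parse_full_report_line() function, and the buffer sizes are illustrative assumptions and are not part of edac-utils. The sketch also assumes that no field contains a ':' character.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of the "full" report. Field sizes are arbitrary. */
struct full_report_entry {
    char mc[32];          /* memory controller id */
    char csrow[32];       /* csrow id, or "noinfo" */
    char label[64];       /* DIMM label, or "all" */
    int uncorrected;      /* 1 for UE, 0 for CE */
    unsigned long count;  /* error count */
};

/*
 * Parse one line of the form MC:(csrow|noinfo):(label|all):(UE|CE):count.
 * Returns 0 on success and -1 if the line does not match.
 * ASSUMPTION: no field contains ':'. On failure, *e may be partly filled.
 */
int parse_full_report_line (const char *line, struct full_report_entry *e)
{
    char type[3];
    char extra;
    const char *last;

    assert (line != NULL && e != NULL);

    /* The final field must start with a digit (rejects signs and text). */
    last = strrchr (line, ':');
    if (last == NULL || !isdigit ((unsigned char) last[1]))
        return -1;

    /* Exactly five fields; a sixth conversion means trailing garbage. */
    if (sscanf (line, "%31[^:]:%31[^:]:%63[^:]:%2[A-Z]:%lu %c",
                e->mc, e->csrow, e->label, type, &e->count, &extra) != 5)
        return -1;

    if (strcmp (type, "UE") == 0)
        e->uncorrected = 1;
    else if (strcmp (type, "CE") == 0)
        e->uncorrected = 0;
    else
        return -1;

    return 0;
}
```

Given a made-up input line such as mc0:csrow2:DIMM_A1:CE:3, the function fills in mc "mc0", csrow "csrow2", label "DIMM_A1", uncorrected 0, and count 3. A line such as "No errors to report." from the default report is rejected with -1.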

EDAC(3) 						   EDAC error reporting library 						   EDAC(3)

NAME
       libedac - EDAC error reporting library

SYNOPSIS
       #include <edac.h>

       cc ... -ledac

       edac_handle * edac_handle_create (void);

       void edac_handle_destroy (edac_handle *edac);

       int edac_handle_init (edac_handle *edac);

       unsigned int edac_mc_count (edac_handle *edac);

       int edac_handle_reset (edac_handle *edac);

       int edac_error_totals (edac_handle *edac, struct edac_totals *totals);

       edac_mc * edac_next_mc (edac_handle *edac);

       int edac_mc_get_info (edac_mc *mc, struct edac_mc_info *info);

       edac_mc * edac_next_mc_info (edac_handle *edac, struct edac_mc_info *info);

       int edac_mc_reset (struct edac_mc *mc);

       edac_csrow * edac_next_csrow (struct edac_mc *mc);

       int edac_csrow_get_info (edac_csrow *csrow, struct edac_csrow_info *info);

       edac_csrow * edac_next_csrow_info (edac_mc *mc, struct edac_csrow_info *info);

       const char * edac_strerror (edac_handle *edac);

       edac_for_each_mc_info (edac_handle *edac, edac_mc *mc, struct edac_mc_info *info) { ... }

       edac_for_each_csrow_info (edac_mc *mc, edac_csrow *csrow, struct edac_csrow_info *info) { ... }

DESCRIPTION
       The libedac library offers a very simple programming interface to the
       information exported from in-kernel EDAC (Error Detection and
       Correction) drivers in sysfs. The edac-util(1) utility uses libedac to
       report errors in a user-friendly manner from the command line.

       EDAC errors for most systems are recorded in sysfs on a per memory
       controller (MC) basis. Memory controllers are further subdivided by
       csrow and channel. The libedac library provides a method to loop
       through multiple MCs, and their corresponding csrows, obtaining
       information about each component from sysfs along the way. There is
       also a simple single call to retrieve the total error counts for a
       given machine.

       In order to use libedac, an edac_handle must first be created via the
       call edac_handle_create(). Once the handle is created, sysfs data can
       be loaded into the handle with edac_handle_init(). A final call to
       edac_handle_destroy() will free all memory and open files associated
       with the edac handle.

       edac_handle_create() will return NULL on failure to allocate memory.

       The edac_strerror() function will return a descriptive string
       representation of the last error for the libedac handle edac.

       The edac_error_totals() function will return the total counts of memory
       and PCI errors in the totals structure passed to the function. The
       totals structure is of type edac_totals, which has the form:

           struct edac_totals {
               unsigned int ce_total;         /* Total corrected errors   */
               unsigned int ue_total;         /* Total uncorrected errors */
               unsigned int pci_parity_total; /* Total PCI Parity errors  */
           };

MEMORY CONTROLLER INFORMATION
       Systems may have one or more memory controllers (MCs) with EDAC
       information. The number of MCs detected by EDAC drivers may be queried
       with the edac_mc_count() function, while the edac_next_mc() function
       will return a handle to the next memory controller in the libedac
       handle's internal list. This memory controller is represented by the
       opaque edac_mc type. edac_next_mc() will return NULL when there are no
       further memory controllers to return. Thus the following example code
       is another method to count all EDAC MCs (assuming the EDAC library
       handle edac has already been initialized):

           int i = 0;
           edac_mc *mc;
           while ((mc = edac_next_mc (edac)))
               i++;
           return (i);

       To query information about an edac_mc, use the edac_mc_get_info()
       function. This function fills in the given info structure, which is of
       type edac_mc_info:

           struct edac_mc_info {
               char         id[];            /* Id of memory controller   */
               char         mc_name[];       /* Name of MC                */
               unsigned int size_mb;         /* Amount of RAM in MB       */
               unsigned int ce_count;        /* Corrected error count     */
               unsigned int ce_noinfo_count; /* noinfo Corrected errors   */
               unsigned int ue_count;        /* Uncorrected error count   */
               unsigned int ue_noinfo_count; /* noinfo Uncorrected errors */
           };

       The function edac_next_mc_info() can be used to loop through all EDAC
       memory controllers and obtain MC information in a single call. It is a
       combined edac_next_mc() and edac_mc_get_info().

       The function edac_handle_reset() will reset the internal memory
       controller iterator in the libedac handle. A subsequent call to
       edac_next_mc() would thus return the first EDAC MC.

       A convenience macro, edac_for_each_mc_info(), is provided which defines
       a for loop that iterates through all memory controller objects for a
       given EDAC handle, returning the MC information in the info structure
       on each iteration.
       For example (assuming initialized libedac handle edac):

           edac_mc *mc;
           struct edac_mc_info info;
           int count = 0;

           edac_for_each_mc_info (edac, mc, info) {
               count++;
               printf ("MC info: id=%s name=%s\n", info.id, info.mc_name);
           }

CSROW INFORMATION
       Each EDAC memory controller may have one or more csrows associated with
       it. Similar to the MC iterator functions described above, the
       edac_next_csrow() function allows libedac users to loop through all
       csrows within a given MC. Once the last csrow is reached, the function
       will return NULL.

       The edac_csrow_get_info() function returns information about an
       edac_csrow in the edac_csrow_info structure, which has the contents:

           struct edac_csrow_info {
               char         id[];     /* CSROW Identity (e.g. csrow0) */
               unsigned int size_mb;  /* CSROW size in MB             */
               unsigned int ce_count; /* Total corrected errors       */
               unsigned int ue_count; /* Total uncorrected errors     */
               struct edac_channel channel[EDAC_MAX_CHANNELS];
           };

           struct edac_channel {
               int          valid;            /* Is this channel valid */
               unsigned int ce_count;         /* Corrected error count */
               int          dimm_label_valid; /* Is DIMM label valid?  */
               char         dimm_label[];     /* DIMM name             */
           };

       The edac_next_csrow_info() function is a combined version of
       edac_next_csrow() and edac_csrow_get_info() for convenience.

       The edac_mc_reset() function is provided to reset the edac_mc internal
       csrow iterator.

       A convenience macro, edac_for_each_csrow_info(), is provided which
       defines a for loop that iterates through all csrow objects in an EDAC
       memory controller, returning the csrow information in the info
       structure on each iteration.

EXAMPLES
       Initialize libedac handle:

           edac_handle *edac;

           if (!(edac = edac_handle_create ())) {
               fprintf (stderr, "edac_handle_create: Out of memory!\n");
               exit (1);
           }

           if (edac_handle_init (edac) < 0) {
               fprintf (stderr, "Unable to get EDAC data: %s\n",
                        edac_strerror (edac));
               exit (1);
           }

           /* edac_mc_count() returns unsigned int, hence %u */
           printf ("EDAC initialized with %u MCs\n", edac_mc_count (edac));

           edac_handle_destroy (edac);

       Report all DIMM labels for MC:csrow:channel combinations (assuming
       initialized libedac handle edac):

           edac_mc *mc;
           edac_csrow *csrow;
           struct edac_mc_info mci;
           struct edac_csrow_info csi;

           edac_for_each_mc_info (edac, mc, mci) {
               edac_for_each_csrow_info (mc, csrow, csi) {
                   const char *label[2] = { "unset", "unset" };

                   if (csi.channel[0].dimm_label_valid)
                       label[0] = csi.channel[0].dimm_label;
                   if (csi.channel[1].dimm_label_valid)
                       label[1] = csi.channel[1].dimm_label;

                   printf ("%s:%s:ch0 = %s\n", mci.id, csi.id, label[0]);
                   printf ("%s:%s:ch1 = %s\n", mci.id, csi.id, label[1]);
               }
           }

SEE ALSO
       edac-util(1), edac-ctl(8)

                                  2011-11-09                           EDAC(3)
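
As the DESCRIPTION above notes, libedac gathers its data from files that the EDAC drivers export in sysfs. For readers curious about what that involves, the following C sketch reads a single counter file directly, without libedac. It assumes the common kernel layout in which per-controller attributes such as ce_count and ue_count live under /sys/devices/system/edac/mc/mcN/. That path and the helper names edac_mc_attr_path() and read_edac_counter() are assumptions of this sketch, not part of the libedac API, and this page does not state how libedac itself reads the files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the path of a per-MC attribute file into buf.
 * ASSUMPTION: the /sys/devices/system/edac/mc/mcN/ layout; verify it on
 * the target kernel.
 * Returns 0 on success, -1 on a bad attribute name or a too-small buffer.
 */
int edac_mc_attr_path (char *buf, size_t len, unsigned int mc,
                       const char *attr)
{
    int n;

    assert (buf != NULL && attr != NULL);

    /* Only plain attribute names are accepted, never sub-paths. */
    if (attr[0] == '\0' || strchr (attr, '/') != NULL)
        return -1;

    n = snprintf (buf, len, "/sys/devices/system/edac/mc/mc%u/%s", mc, attr);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/*
 * Read one unsigned decimal counter from a sysfs-style text file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_edac_counter (const char *path, unsigned int *value)
{
    FILE *fp;
    int ok;

    assert (path != NULL && value != NULL);

    if (!(fp = fopen (path, "r")))
        return -1;
    ok = (fscanf (fp, "%u", value) == 1);
    fclose (fp);
    return ok ? 0 : -1;
}
```

A caller would typically build a path with edac_mc_attr_path (buf, sizeof (buf), 0, "ce_count") and pass the result to read_edac_counter(). On a machine without loaded EDAC drivers the file will not exist and the read returns -1. For anything beyond a quick check, the libedac calls documented above remain the supported interface.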