Known Problems through 2012 (version 4.2.3)
Older Generic Run-time Problems:
- MM3 slowdown:
A longstanding “feature” of netCDF3 was identified in
March, 2012, and is now known by the tag MM3.
The MM3 issue can lead to unusually slow performance.
The problem is triggered by an aggregate pattern of file access so the
workaround must be implemented in the application software (e.g., NCO)
rather than in the netCDF library itself.
The name MM3 fits because the problem is normally encountered on
Multi-record Multi-variable netCDF3 files.
And we call our “solution” the MM3-workaround.
If you encounter unusually slow NCO performance while using NCO to
analyze MM3 files on a large blocksize filesystem,
chances are you are encountering an MM3-induced slowdown.
NCO release 4.1.0 implements the MM3-workaround for ncks.
It speeds up common ncks subsetting on NCAR's GLADE by 10-50x.
MM3-induced slowdowns are present in other NCO operators, and we are
prioritizing MM3 patches for the operators encountered most often.
Thanks to Gary Strand for reporting this problem, and to Russ Rew for
creating the workaround algorithm, which is also now in nccopy.
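The layout issue behind MM3 can be sketched with a toy model. The following is a schematic illustration (not NCO or netCDF source; the slab counts and seek counting are invented for the example) of the netCDF3 record section, where each record stores one slab of every record variable consecutively on disk. Reading variable-at-a-time therefore jumps around the file, while record-at-a-time access (the order the MM3-workaround uses) reads sequentially:

```python
# Toy model (not NCO source) of the netCDF3 record section, in which each
# record holds one slab of every record variable consecutively on disk.
n_var, n_rec = 4, 3  # hypothetical file: 4 record variables, 3 records

# Disk layout: record 0 of every variable, then record 1, and so on
disk_order = [(rec, var) for rec in range(n_rec) for var in range(n_var)]

def seeks(access_order):
    """Count the non-contiguous jumps a reader makes for a given access order."""
    pos = [disk_order.index(slab) for slab in access_order]
    return sum(1 for a, b in zip(pos, pos[1:]) if b != a + 1)

# Naive order: all records of variable 0, then all records of variable 1, ...
var_major = [(rec, var) for var in range(n_var) for rec in range(n_rec)]
# Workaround order: all variables of record 0, then record 1, ...
rec_major = [(rec, var) for rec in range(n_rec) for var in range(n_var)]

print(seeks(var_major))  # every transition is a jump
print(seeks(rec_major))  # matches the disk layout: no jumps
```

On a large-blocksize filesystem each of those jumps can cost a full block read, which is why the access order alone can change runtimes by an order of magnitude.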
- NOFILL bug:
All netCDF versions prior to 4.1.3 may create corrupt netCDF3 files
when linked to any version of NCO except 4.0.8.
The solution is to install netCDF version 4.1.3 or later.
The corruption occurs silently (without warning or error messages).
The problem has been seen "in the wild" only on filesystems with
large block sizes (e.g., Lustre), although it may be more widespread.
It is caused by a netCDF bug that NCO triggers by invoking
NOFILL mode for faster writes. Hence it is called the NOFILL bug.
The bug is hard to trigger: it depends on a rare interaction of
filesystem block-size, hyperslab size, and order-of-variable writing.
The bug exists in all versions of netCDF through 4.1.2.
If you have a large block filesystem and cannot upgrade your netCDF
library, then use NCO version 4.0.8, which disables NOFILL mode (and
thus writes files more slowly).
NCO 4.0.8 works around the NOFILL bug on all versions of netCDF
(i.e., 4.1.2 and earlier).
Hence NCO 4.0.8 always correctly writes netCDF3 files.
Other temporary workarounds include creating only netCDF4 files
(e.g., ncks -4 ...) instead of netCDF3 files.
The NOFILL patch included in NCO 4.0.8 was subsequently removed
in NCO 4.0.9, which assumes that netCDF 4.1.3 or later is installed.
- Degenerate hyperslabbing bug:
Versions ???—4.0.6 could return incorrect hyperslabs when
user-specified hyperslabs did not include at least one point.
In such cases, instead of returning no data, hyperslabs could return all data.
To determine whether your NCO is affected by this bug, run these commands:
ncks -O -v lat -d lat,20.,20.001 ~/nco/data/in.nc ~/foo.nc ; ncks -H ~/foo.nc
If the returned hyperslab contains any data, then your NCO is buggy,
because that hyperslab should be empty.
The bug can thus produce incorrect answers for any hyperslab that
should be empty.
Analogous problems would occur with empty auxiliary coordinate bounding boxes.
Although most users do not specify empty hyperslabs, we urge all users
to upgrade to NCO 4.0.7+ just to be safe.
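The correct semantics can be sketched in a few lines of Python. This is a hypothetical illustration, not NCO source; hyperslab is an invented helper name:

```python
# Sketch of value-based hyperslab selection. A correct implementation
# returns an empty selection when no coordinate values fall within the
# user-specified bounds; the pre-4.0.7 bug could instead return the
# entire dimension.
def hyperslab(coords, lo, hi):
    """Return indices of coords lying in the closed interval [lo, hi]."""
    return [i for i, c in enumerate(coords) if lo <= c <= hi]

lat = [-90.0, -45.0, 0.0, 45.0, 90.0]
print(hyperslab(lat, 20.0, 20.001))  # correct: [] (empty hyperslab)
print(hyperslab(lat, -90.0, 0.0))    # correct: [0, 1, 2]
```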
- Threading problems with MSA:
NCO version 3.9.5 has a nasty bug that causes threaded arithmetic
operators, e.g., nces, to produce incorrect results under some
conditions.
The problem may occur whenever OpenMP is enabled and the operators
run on a multi-core CPU with more than one thread.
These incorrect answers, if generated, are relatively easy to notice.
The number of threads used to generate a file is, by default, recorded
in the global attribute nco_openmp_thread_number which may
be examined with ncks -M foo.nc | grep nco_openmp_thread_number.
The only action that will correct a file that you think (or know)
contains corrupted data because of this NCO bug is to re-process the
file with a non-buggy NCO version.
Version 3.9.5 is buggy and should be upgraded ASAP.
Be careful with data processed using this NCO version on multi-core CPUs.
A (one-line!) patch that fixes this bug in 3.9.5 is available.
- Index-based hyperslab problems:
NCO versions 2.7.3—2.8.3 have a nasty bug that causes
index-based hyperslabs, e.g., -d lat,1, to
behave like value-based hyperslabs, e.g., -d lat,1.0 under
some conditions.
Unfortunately, the incorrect answers generated may be hard to notice!
This problem was most often encountered by users trying to assemble
monthly averages using the stride feature of ncrcat.
One common symptom is that the time-offset of the output file is
incorrect.
Versions 2.7.3—2.8.3 are buggy and should be upgraded ASAP.
Re-do any data-processing that used index-based hyperslabbing with
these versions of NCO.
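The distinction the bug erased can be sketched as follows. NCO decides between index- and value-based limits by whether the limit contains a decimal point; in this illustrative Python stand-in (not NCO source) the Python int/float types play that role, and select is an invented helper name:

```python
# An integer limit like "-d lat,1" selects by index, while a limit with
# a decimal point like "-d lat,1.0" selects by coordinate value.
def select(coords, limit):
    if isinstance(limit, int):  # index-based: use the subscript directly
        return [limit]
    # value-based: find indices whose coordinate equals the value
    return [i for i, c in enumerate(coords) if c == limit]

lat = [-90.0, 0.0, 1.0, 90.0]
print(select(lat, 1))    # index-based: element 1 (coordinate 0.0)
print(select(lat, 1.0))  # value-based: element 2 (coordinate 1.0)
```

The buggy versions effectively treated the first form like the second, which silently changed which elements, and hence which time slices, were extracted.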
Older Operator-specific Run-time Problems:
- ncks bug with auxiliary coordinates:
Versions 4.2.x–4.3.1 of ncks did not correctly
support auxiliary coordinates (specified with -X).
Auxiliary coordinates continued to work with the other hyperslabbing
NCO operators. Auxiliary coordinates once again work in all
hyperslabbing operators, including on netCDF4 group files in operators
that support them.
Fixed in version 4.3.2.
- ncatted bug on implicit attribute names:
Versions 4.2.x–4.3.0 of ncatted could segfault when
processing attributes specified implicitly (i.e., by leaving the
attribute field blank in the -a specification).
Fixed in version 4.3.1.
- ncbo bug handling certain special variables:
Version 4.3.0 of ncbo inadvertently always turns off
certain exceptions to variable-list processing.
This may cause some grid-related variables (e.g., ntrm and nbdate)
and some non-grid variables (e.g., ORO and gw) to be
arithmetically processed (e.g., subtracted) even though doing so makes
no sense in most climate model datasets.
Fixed in version 4.3.1.
- ncks bug copying metadata:
Version 4.2.6 of ncks does not copy variable metadata by default.
Thus output files appear stripped of metadata.
One can work around this problem in 4.2.6 by specifying the -m option.
Otherwise an upgrade is recommended.
Fixed in version 4.3.0.
- ncks bug subsetting variables:
Version 4.2.4 of ncks sometimes dumps core
when subsetting variables with -v var.
Fixed in version 4.2.5.
- ncks bug with altering record dimensions:
Version 4.2.4 of ncks ignored both the
--mk_rec_dmn and the --fix_rec_dmn switches.
It exited successfully without altering the record dimension.
Fixed in version 4.2.5.
- nces bug with non-record files:
Versions 4.2.1—4.2.3 of nces incorrectly referenced
the record variable on files that do not contain one.
This caused a segmentation violation and core dump.
- ncra bug when last file(s) is/are superfluous:
Versions 4.2.1—4.2.3 of ncra incorrectly skipped
writing the results of the final normalization when trailing files
were superfluous (not used).
In the most common case, all values are zeros in the output file.
Upgrade if you call ncra with trailing superfluous files.
- ncecat bug when files generated with -n:
Version 4.2.2 of ncecat could incorrectly skip the first
input file in the default mode (RECORD_AGGREGATE) when
the -n NINTAP switch is used to automate filename generation.
Upgrade if you use ncecat -n.
- ncra bug handling CF coordinates attributes
that contain the name of the record coordinate:
Versions 4.0.3—4.0.4 of ncra incorrectly treat the
record variable (usually time) as a fixed variable if it
is specified in the coordinates attribute of any variable in
a file processed with CCM/CCSM/CF metadata conventions.
This bug caused core dumps, and even weirder behavior like
creating imaginary time slices in the output.
Upgrade recommended if you work with NCAR CCSM/CESM model output.
One workaround that does not require NCO upgrades is to remove the
record coordinate name (usually time) from
the coordinates attribute of all variables in CF-compliant
files before processing the file with ncra.
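The attribute edit that workaround requires is a simple string operation. Here is a hypothetical Python helper (drop_record_coordinate is an invented name, not part of NCO) showing the intended transformation of a CF coordinates attribute value:

```python
# Strip the record coordinate name (usually "time") from a
# space-separated CF "coordinates" attribute string, as the workaround
# for the affected ncra versions requires.
def drop_record_coordinate(coordinates_att, rec_crd="time"):
    """Remove rec_crd from a space-separated coordinates attribute."""
    return " ".join(tok for tok in coordinates_att.split() if tok != rec_crd)

print(drop_record_coordinate("time lat lon"))  # -> "lat lon"
```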
- ncra bug averaging YYYYMMDD-format date
variables in CCSM/CF-compliant files:
Versions ???—4.0.5 of ncra contain a bug which
produces an incorrect average (usually zero) of the date
variable which many CCSM/CF-compliant files use to track model dates
in the human-readable YYYYMMDD-format.
Averaging YYYYMMDD-format integers is intrinsically difficult, since
such dates have calendar assumptions built-in.
NCO attempts this in CCSM/CF-compliant files by using the
nbdate (beginning date) and time (days
since nbdate) variables to find the average date,
converting that to YYYYMMDD, and writing that as the average value
of date.
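The strategy described above can be sketched in Python. This is an illustrative reconstruction under stated assumptions (a standard Gregorian calendar via datetime, whereas model calendars may differ; average_date is an invented name, not NCO source):

```python
# Average YYYYMMDD dates via nbdate (beginning date, a YYYYMMDD integer)
# plus the mean of time (days since nbdate), then convert back to
# YYYYMMDD. Naively averaging the integers 20000131 and 20000201 would
# give 20000166, which is not a date.
from datetime import date, timedelta

def average_date(nbdate, time_days):
    base = date(nbdate // 10000, (nbdate // 100) % 100, nbdate % 100)
    mean_offset = sum(time_days) / len(time_days)
    avg = base + timedelta(days=mean_offset)
    return avg.year * 10000 + avg.month * 100 + avg.day

# days 30 and 32 after 2000-01-01 average to day 31, i.e., 2000-02-01
print(average_date(20000101, [30.0, 32.0]))  # -> 20000201
```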
- ncks bug hyperslabbing fixed netCDF4 dimensions:
Versions 4.0.3—4.0.4 of ncks contain a bug which
triggers a core-dump when hyperslabbing (along a non-record
dimension) a netCDF4-format input file into a netCDF4-format output
file, e.g., ncks -d lat,0,1 in4.nc out4.nc.
Three workarounds that do not require NCO upgrades (or downgrades) are:
explicitly specify chunking with, e.g.,
ncks --cnk_plc=all -d lat,0,1 in4.nc out4.nc; use
nces instead of ncks for hyperslabbing, e.g.,
nces -d lat,0,1 in4.nc out4.nc (nces performs a no-op
when there is only one input file); or write to a netCDF3 file, e.g.,
ncks -3 -d lat,0,1 in4.nc out3.nc.
- Core dump with ncks:
Printing variables to screen with ncks can trigger a segfault
in NCO 3.9.9—4.0.3.
Users may upgrade, downgrade, or apply a one-line patch to the 3.9.9
sources: removing the line
“*cnk_sz=(size_t)NULL;”
near line 751 of nco/src/nco/nco_netcdf.c
should fix the problem.
The segfault in later NCO versions is due to a different bug, so this
patch will not fix it there.
- ncrename erroneous error exit:
Versions 4.0.1—4.0.3 of ncrename contain a bug where
commands like ncrename -a .old_nm,new_nm in.nc out.nc
would, if old_nm did not exist, write the correct file and
then exit with an error message although no error had occurred.
The files written were fine, and the error message can be safely
ignored. This was due to not clearing an extraneous return code.
- ncbo segmentation fault:
ncbo versions 4.0.0—4.0.2 incorrectly refreshed
internal metadata, leading to segmentation faults and core dumps with
some exacting compilers, notably xlC on AIX.
- ncra segmentation fault:
ncra versions 4.0.0—4.0.1 mishandled some CF-compliant
dates, leading to segmentation faults and core dumps.
- Arithmetic problems with ncap division, modulo, and exponentiation:
ncap versions < 3.0.1 incorrectly exponentiate
variables to variable powers (V^V).
We recommend that all ncap users upgrade.
ncap versions up to 2.9.1 incorrectly handle division,
modulo, and exponentiation operations of the form S/V,
S%V, and S^V, where the first operand (S) is a
scalar (i.e., either typed directly in the ncap script or
converted from an attribute) and the second operand (V) is
a full variable (i.e., stored in a file or computed by ncap).
Instead of the requested quantity, ncap returned
V/S, V%S, and V^S.
In other words ncap treated some non-commutative operations
as commutative. This is now fixed.
The
V/V, V%V, V^V,
V/S, V%S, V^S,
S/S, S%S, and S^S operations were never
affected.
We recommend that all ncap users upgrade.
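The non-commutativity the bug violated is easy to demonstrate. Here plain Python lists stand in for netCDF variables (an illustrative sketch, not NCO source):

```python
# The buggy ncap evaluated the scalar-variable forms S/V, S%V, and S^V
# as if the operands were swapped, i.e., as V/S, V%S, and V^S.
S = 2.0
V = [1.0, 2.0, 4.0]

s_over_v = [S / v for v in V]  # correct S/V: [2.0, 1.0, 0.5]
v_over_s = [v / S for v in V]  # what buggy versions returned: [0.5, 1.0, 2.0]

print(s_over_v)
print(v_over_s)
```

Since division is non-commutative, the two results differ element-by-element, which is exactly why the swap produced wrong answers.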
- Incorrect ncbo output for packed input:
ncbo versions ???—3.2.0 incorrectly write differences
of packed input. This only affects packed variables.
- Problems with ncflint and missing_values:
The algorithm ncflint used to perform interpolation in
versions up to 2.9.4 was not commutative:
it returned either the weighted valid datum or
missing_value when the other datum was missing_value,
depending on the order in which the input files were specified.
As of version 2.9.5, ncflint always returns
missing_value when either input datum is
missing_value.
Possible future implementations are discussed in the NCO
documentation.
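The rule adopted in 2.9.5 can be sketched as follows (interpolate is a hypothetical helper, not NCO source):

```python
# The interpolated value is missing whenever either input datum is
# missing, which makes the result independent of input-file order.
def interpolate(x1, x2, w1, w2, missing_value):
    if x1 == missing_value or x2 == missing_value:
        return missing_value
    return w1 * x1 + w2 * x2

mss = -999.0
print(interpolate(1.0, 3.0, 0.5, 0.5, mss))  # valid data: 2.0
print(interpolate(1.0, mss, 0.5, 0.5, mss))  # either datum missing -> missing
print(interpolate(mss, 1.0, 0.5, 0.5, mss))  # order does not matter
```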
- Problems with ncra and nces when missing_value = 0.0:
The algorithm ncra and nces used to perform
arithmetic in versions up to 2.9.2 broke if missing_value
was 0.0.
Why, you ask?
Running average (or total, etc.) algorithms must initialize the answer
to 0.0.
This is done since the sum accumulates in place as ncra and
nces proceed across records and files.
(Normalizing this accumulation by the total number of records is the
last step.)
The old algorithm compared both the current running average and the
new record to the missing_value.
If either comparison matched, then nothing accumulated for that
record.
Because the accumulator was initialized to 0.0, which equaled
missing_value, the running average matched missing_value from the
start, so valid data could never be recognized.
As a result nothing accumulated and the answer was always zero.
The record and ensemble averages would also fail (in a non-obvious
way) whenever an intermediate sum equalled missing_value.
The chances of the latter event ever happening are exceedingly
remote.
The new algorithm compares only the new record to the
missing_value.
This fixes both problems and is faster, too.
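The old and new accumulation tests can be contrasted in a few lines of plain Python (an illustrative sketch, not NCO source; the function names are invented):

```python
# With missing_value = 0.0, the old algorithm's check of the running
# sum against missing_value always matched the 0.0 initialization, so
# valid data never accumulated.
def average_old(records, missing_value):
    total, count = 0.0, 0
    for x in records:
        if total == missing_value or x == missing_value:  # buggy test
            continue
        total += x
        count += 1
    return total / count if count else missing_value

def average_new(records, missing_value):
    total, count = 0.0, 0
    for x in records:
        if x == missing_value:  # compare only the new record
            continue
        total += x
        count += 1
    return total / count if count else missing_value

data = [1.0, 2.0, 0.0, 3.0]    # 0.0 marks missing data
print(average_old(data, 0.0))  # broken: nothing ever accumulates
print(average_new(data, 0.0))  # correct: (1 + 2 + 3) / 3 = 2.0
```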
- Packing problems with ncwa:
NCO versions ???—2.9.0 have a bug that causes ncwa
to fail (produce garbage answers) when processing packed
NC_FLOAT data. Version 2.9.1 fixes this problem.
This problem may have been noticed most by
OPeNDAP users since many
netCDF climate datasets served by
OPeNDAP are packed
NC_FLOATs.
Upgrade to 2.9.1 if you use ncwa on packed data.
- Packing problems with ncap:
NCO versions 2.8.4—2.8.6 have a bug that causes the ncap
intrinsic packing function pack() to fail.
Version 2.8.7 fixes this problem.
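For context, both packing bugs concern the standard netCDF packing convention based on the scale_factor and add_offset attributes. A minimal Python sketch of that convention (the helper names are invented; this is not NCO's pack() implementation, which also chooses the scale and offset automatically):

```python
# netCDF packing convention: packed = round((unpacked - add_offset) /
# scale_factor), stored in a narrower integer type, and
# unpacked = scale_factor * packed + add_offset.
def pack(values, scale_factor, add_offset):
    return [round((v - add_offset) / scale_factor) for v in values]

def unpack(packed, scale_factor, add_offset):
    return [scale_factor * p + add_offset for p in packed]

scale, offset = 0.01, 300.0
temps = [299.50, 300.25, 301.00]
packed = pack(temps, scale, offset)   # small integers: [-50, 25, 100]
print(unpack(packed, scale, offset))  # recovers the original values
```

Arithmetic on packed variables must unpack, operate, and repack; the bugs above stemmed from operators mishandling one of those steps.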
Older Platform-specific Run-time Problems:
- Float-valued intrinsic arithmetic functions in ncap on AIX:
ncap versions through 4.0.4 have a bug that causes all float-valued
intrinsic math functions to fail under AIX.
Float-valued math functions are the ISO C99 functions, e.g.,
cosf(), fabsf(), logf().
The user does not invoke these functions directly—
the user always specifies the generic function name, e.g.,
cos(), abs(), log().
NCO automatically calls the native single precision (i.e.,
float-valued) math functions when the generic function argument
is a native float (e.g., naked constants like 1.0f or
variables stored as NC_FLOAT).
Double precision arguments cause NCO to invoke the standard
(double-valued) form of the generic function, e.g., cos(),
fabs(), log().