Using neat

Once you have installed neat and checked that everything is working properly, you can start feeding your own data into the code. Our philosophy for neat is that abundance determinations should be as objective as possible, and so user input should be minimised. A number of choices can be made by the user, but the default options should yield meaningful results, and therefore getting going should be straightforward.

Preparing the input file

Format

neat requires as input a plain text file containing a list of emission line fluxes. The file can contain two, three or four columns. If two columns are found, the code assumes they contain the laboratory wavelength of the line (λ₀) and its flux (F). Three columns are assumed to be λ₀, F, and the uncertainty on F (ΔF). Four columns are assumed to be the observed wavelength of the line (λ_obs), λ₀, F, and ΔF. The rest wavelengths should correspond exactly to those listed in the file utilities/complete_line_list. The flux column should be a flux per unit wavelength (any units are fine), and the uncertainty on the flux should be given in the same units. Examples can be found in the example/ directory.
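To make the column convention concrete, here is a minimal Python sketch of a reader that interprets a line list in the way described above. It is not part of neat: the function name parse_line_list and the dictionary keys are hypothetical, and the sketch assumes whitespace-separated, purely numeric columns and decides the layout line by line.

```python
def parse_line_list(text):
    """Parse a two-, three- or four-column line list into a list of dicts.

    Hypothetical helper for illustration only; it mirrors the column
    convention described in the manual, not neat's own reader.
    """
    rows = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 2:
            keys = ("rest_wavelength", "flux")
        elif len(fields) == 3:
            keys = ("rest_wavelength", "flux", "uncertainty")
        elif len(fields) == 4:
            keys = ("observed_wavelength", "rest_wavelength",
                    "flux", "uncertainty")
        else:
            raise ValueError(f"expected 2-4 columns, got {len(fields)}: {raw!r}")
        # Assumes every field is numeric.
        rows.append({key: float(value) for key, value in zip(keys, fields)})
    return rows


# Illustrative fluxes only, not real measurements.
sample = "4861.33 100.0 2.0\n5006.84 350.0 5.0\n"
for row in parse_line_list(sample):
    print(row)
```

Reading your file back in this way before running neat is a quick check that every line has the number of columns you intended.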

Rest wavelengths

neat identifies lines by comparing the quoted wavelength to its list of rest wavelengths. However, rest wavelengths of lines can differ by up to a few tenths of an Angstrom depending on the source. Making sure that neat is identifying your lines properly is probably the most important step in using the code, and misidentifications are the first thing to suspect if you get unexpected results. To assist with preparing the line list, the command line option -id can be used. This applies a very simple algorithm to the input line list to determine their rest wavelengths, which works as follows:

1. A reference list of 10 rest wavelengths of common bright emission lines is compared to the wavelengths of the 20 brightest lines in the observed spectrum.
2. Close matches are identified and the mean offset between observed and rest wavelengths is calculated.
3. The shift is applied, and then an RMS scatter between shifted and rest wavelengths is calculated.
4. This RMS scatter is then used as a tolerance to assign line IDs based on close coincidences between shifted observed wavelengths and the full catalogue of rest wavelengths listed in utilities/complete_line_list.

The routine is not intended to provide 100% accuracy and one should always check very carefully whether the lines are properly identified, particularly in the case of high resolution spectra.
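The four steps above can be sketched in Python as follows. This is an illustration of the described procedure under stated assumptions, not neat's actual implementation: the function name identify_lines, the coarse_window used to decide what counts as a "close match", and the data structures are all hypothetical.

```python
import math


def identify_lines(observed, reference, catalogue, n_bright=20, coarse_window=2.0):
    """Assign rest wavelengths to observed lines.

    observed  : list of (observed_wavelength, flux) pairs
    reference : rest wavelengths of a few common bright lines
    catalogue : full list of rest wavelengths
    coarse_window is an assumed matching window (same units as the
    wavelengths); the manual does not state the value neat uses.
    Returns (shift, rms, ids), where ids maps each observed wavelength
    to a rest wavelength or to None if nothing lies within the tolerance.
    """
    # Steps 1-2: pair the brightest observed lines with nearby reference lines.
    brightest = sorted(observed, key=lambda line: line[1], reverse=True)[:n_bright]
    pairs = []
    for wavelength, _flux in brightest:
        nearest = min(reference, key=lambda rest: abs(rest - wavelength))
        if abs(nearest - wavelength) <= coarse_window:
            pairs.append((wavelength, nearest))
    if not pairs:
        raise ValueError("no bright lines matched the reference list")
    shift = sum(rest - wavelength for wavelength, rest in pairs) / len(pairs)

    # Step 3: RMS scatter of the shifted wavelengths about the rest wavelengths.
    rms = math.sqrt(
        sum((wavelength + shift - rest) ** 2 for wavelength, rest in pairs)
        / len(pairs)
    )

    # Step 4: use the RMS as the tolerance against the full catalogue.
    ids = {}
    for wavelength, _flux in observed:
        shifted = wavelength + shift
        nearest = min(catalogue, key=lambda rest: abs(rest - shifted))
        ids[wavelength] = nearest if abs(nearest - shifted) <= rms else None
    return shift, rms, ids
```

In this sketch the tolerance is exactly the RMS of the calibration residuals, so any line whose residual exceeds that scatter is left unidentified. That behaviour is a property of the sketch, but it illustrates why automated identifications should be checked by hand.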

Line blends

In low resolution spectra, lines of comparable intensity may be blended into a single feature. These can be indicated with an asterisk instead of a flux in the input line list. Currently, neat has only limited capabilities for dealing with blends: lines marked as blends are not used in any abundance calculations, and apart from a few cases, it assumes that all other line fluxes represent unblended or deblended intensities. The exceptions are some collisionally excited lines which are frequently blended, such as the [O II] lines at 3727/3729Å. In these cases the blended flux can be given with the mean wavelength of the blend, and the code will treat it properly. These instances are indicated in the utilities/complete_line_list file by a "b" after the ion name.

The uncertainty column

The uncertainty column of the input file is of crucial importance if you want to estimate uncertainties on the results you derive. There are two points to bear in mind. First, the more realistic your estimate of the line flux measurement uncertainties, the more realistic the estimate of the uncertainties on the results will be. Second, in all cases the final reported uncertainties are a lower limit to the actual uncertainty on the results, because they account only for the propagation of the statistical errors on the line fluxes and not for sources of systematic uncertainty.

In some cases you may not need or wish to propagate uncertainties. In this case you can run just one iteration of the code, and the uncertainty values are ignored if present.

Running the code

Assuming you have a line list prepared as above, you can now run the code. In line with our philosophy that neat should be as simple and objective as possible, this should be extremely straightforward. To use the code in its simplest form on one of the example linelists, you would type

% ./neat -i example/ngc6543_3cols.dat

This would run a single iteration of the code, not propagating uncertainties. You'll see some logging output to the terminal, and the calculated results will have been written to the file example/ngc6543_3cols.dat_results. If this is all you need, then the job's done and you can write a paper now.

Your results will be enhanced greatly, though, if you can estimate the uncertainty associated with them. To do this, invoke the code as follows:

% ./neat -i example/ngc6543_3cols.dat -u

The -u switch causes the code to run 20,000 iterations. In each iteration, each line flux is drawn from a normal distribution with a mean of the quoted flux and a standard deviation of the quoted uncertainty. If the -rp option is specified, then for lines with a signal to noise ratio of less than 6, the line flux is instead drawn from a log-normal distribution which becomes more skewed the lower the signal to noise ratio is. This corrects the low SNR lines for the upward bias which can occur in their measurement. The full procedure is described in Wesson et al. (2012).
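The Gaussian randomisation described above can be illustrated with a short Python sketch. It is a simplified stand-in, not neat's code: the function name draw_flux_realisations and the seed argument are hypothetical, and the log-normal treatment of weak lines is deliberately left out because its parameters are given in Wesson et al. (2012) rather than in this manual.

```python
import random
import statistics


def draw_flux_realisations(lines, n_iterations=20000, seed=None):
    """Return n_iterations randomised copies of a line list.

    lines is a list of (flux, uncertainty) pairs. Each realisation draws
    every flux from a normal distribution centred on the quoted flux with
    the quoted uncertainty as its standard deviation.
    """
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    return [
        [rng.gauss(flux, sigma) for flux, sigma in lines]
        for _ in range(n_iterations)
    ]


# Illustrative values only, not real measurements.
lines = [(100.0, 5.0), (350.0, 7.0)]
realisations = draw_flux_realisations(lines, n_iterations=20000, seed=42)
first_line = [realisation[0] for realisation in realisations]
print(statistics.fmean(first_line), statistics.stdev(first_line))
```

Each realisation would then be passed through the full analysis, so that the spread of any derived quantity across realisations reflects the input flux uncertainties.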

By repeating this randomisation process many times, you build up a realistic picture of the uncertainties associated with the derived quantities. The more iterations you run, the better sampled the probability distributions will be; 20,000 is a sensible number to achieve well sampled distributions. If you want to run a different number of iterations for any reason, you can use the -n command line option to specify your preferred value.

Advanced usage

Sometimes you need a little bit more control over what the code is doing. You can use a different extinction law, and if necessary, you can set the values of the extinction, temperatures and densities or any combination thereof which the code will use to calculate abundances. The full list of command line options is as follows:

Option	Default value	Details
-i --input	-	File name of the plain text file containing the measured line fluxes
-n --n-iterations	1	The number of iterations. Any positive integer is accepted; if the value is 1, then a standard analysis with no uncertainty propagation is carried out. If the value is greater than one, then the code will also calculate uncertainties using a Monte Carlo technique. The higher the number of iterations, the better sampled the uncertainty distribution of the output parameters will be. At least 10,000 iterations are recommended. See Wesson et al. (2012) for a full discussion of this technique.
-u --uncertainties	-	Equivalent to `-n 20000`
-e --extinction-law	How	The interstellar extinction law to be used to deredden the data. The laws currently available are: How: Galactic law of Howarth (1983, MNRAS, 203, 301); CCM: Galactic law of Cardelli, Clayton, Mathis (1989, ApJ, 345, 245); Fitz: Galactic law of Fitzpatrick & Massa (1990, ApJS, 72, 163); LMC: LMC law of Howarth (1983, MNRAS, 203, 301); SMC: SMC law of Prevot et al. (1984, A&A, 132, 389)
-c	Calculated	The value of c(Hβ), the logarithmic extinction at Hβ. If no value is given, the extinction is calculated from the ratios of the four brightest hydrogen Balmer lines. If a value is given, then the Balmer line calculation is not done and the user-specified value is adopted.
-nelow -nemed -nehigh	Calculated	The density to be used in the low, medium and high ionisation zones. If any of these options are present, then the code calculates all the diagnostics as normal, but uses the specified value for abundance calculations in the relevant zone. Units: cm^-3
-telow -temed -tehigh	Calculated	The temperature to be used in the low, medium and high ionisation zones. If any of these options are present, then the code calculates all the diagnostics as normal, but uses the specified value for abundance calculations in the relevant zone. Units: K
-he	S96	The atomic data set to use for He⁺ abundances. The available datasets are currently: S96: Smits D.P., 1996, MNRAS, 278, 683; P12: Porter R.L. et al., 2012, MNRAS, 425, 28, and 2013, MNRAS, 433, 89
-icf --ionisation-correction-scheme	KB94	The ionisation correction scheme used to derive total elemental abundances. The currently available schemes are: KB94: Kingsburgh & Barlow, 1994, MNRAS, 271, 257; PT92: Peimbert, Torres-Peimbert & Ruiz, 1992, RMxAA, 24, 155; DI14: Delgado-Inglada et al. (2014)
-v --verbosity	3	Verbosity setting. This option has no effect unless the number of iterations is greater than 1. The following settings can be used, listed in order of decreasing output: unbinned and binned results written out for all quantities, as well as a summary of all results; only binned results and summary file written out; only summary file written out.
-id --identify	-	Line identification trigger. This attempts to save you the pain of having to manually identify all the lines in your spectrum. When this option is present on the command line, neat will apply a simple algorithm to identify the lines for you. Output should always be very carefully checked, particularly for deep line lists where automated identification is much less reliable.
-rp	-	When calculating Monte Carlo uncertainties, NEAT's default behaviour is to assume that all uncertainties are Gaussian. If -rp is specified, it will compensate for the upward bias affecting weak lines described by Rola and Pelat (1994), assuming log normal probability distributions for weaker lines. Until version 1.8, the default behaviour was the opposite; log normal probability distributions were assumed unless -norp was specified. This was changed after our discovery that the effect described by Rola and Pelat probably only occurs under extremely specific conditions: see Wesson et al., 2016, MNRAS, in press for details.

Outputs

The code prints some logging messages to the terminal, so that you can see which iteration it is on, and if anything has gone wrong. The results are written to a summary file, and a linelist file, the paths to which are indicated in the terminal output. In the case of a single iteration, these files are the only output.

If you have run multiple iterations, the code also writes all the results for every quantity to files, and then processes those results to extract useful quantities from the probability distributions for each quantity. It first bins the results and writes the binned results to file. By setting the -v option you can suppress the output of binned and unbinned results.

Normality test

The code now applies a simple test to the probability distributions to determine whether they are well described by a normal, log-normal or exp-normal distribution. The test applied is that the code calculates the mean and standard deviation of the measured values, their logarithm and their exponent, and calculates in each case the fraction of values lying within 1, 2 and 3σ of the mean. If the fractions are close to the expected values of 68.3%, 95.5% and 99.7%, then the relevant distribution is considered to apply. In these cases, the summary file contains the calculated mean and reports the standard deviation as the 1σ uncertainty.

If the distribution is not well described by a normal-type distribution, then the code reports the median of the distribution and takes the values at 15.9% and 84.1% of the distribution as the lower and upper limits.
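The test and its fallback can be sketched in Python as below. This is an illustration under assumptions, not neat's implementation: the function names, the tolerance that defines "close to" the expected fractions, the order in which the three distributions are tried, and the simple index-based percentiles are all choices made for the sketch. For the log-normal and exp-normal cases the sketch returns the mean and standard deviation of the transformed values.

```python
import math
import statistics

EXPECTED = (0.683, 0.955, 0.997)  # fractions within 1, 2 and 3 sigma

TRANSFORMS = (
    ("normal", lambda value: value),
    ("log-normal", math.log),
    ("exp-normal", math.exp),
)


def fractions_within(values):
    """Return mean, standard deviation and fractions within 1, 2, 3 sigma."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    fractions = tuple(
        sum(abs(value - mean) <= k * sigma for value in values) / len(values)
        for k in (1, 2, 3)
    )
    return mean, sigma, fractions


def summarise(values, tolerance=0.02):
    """Summarise a distribution; tolerance is an assumed threshold."""
    for name, transform in TRANSFORMS:
        try:
            transformed = [transform(value) for value in values]
        except (ValueError, OverflowError):
            continue  # log of a non-positive value, or exp overflow
        mean, sigma, fractions = fractions_within(transformed)
        if all(abs(f - e) <= tolerance for f, e in zip(fractions, EXPECTED)):
            return {"distribution": name, "value": mean, "sigma": sigma}
    # Fallback: median with the 15.9% and 84.1% points as limits.
    ordered = sorted(values)
    last = len(ordered) - 1
    return {
        "distribution": None,
        "value": statistics.median(ordered),
        "lower": ordered[int(0.159 * last)],
        "upper": ordered[int(0.841 * last)],
    }
```

A uniformly distributed sample, for example, has only about 58% of its values within 1σ of the mean, so it fails all three checks in this sketch and falls through to the median-based summary.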

Inspecting the output

It is often useful to directly inspect the probability distributions. In the utilities directory there is a small shell script, utilities/plot.sh, which will plot the histogram of results together with a bar showing the value and its uncertainty as derived above. It will create PNG graphics files for easy inspection.

The script requires that you ran the code with the -v 3 option, and that you have gnuplot installed. It takes one optional parameter, the prefix of the files generated by neat. So, for example, if you've run 10,000 iterations on example/ngc6543_3cols.dat, then there will now be roughly 120 files in the example directory, with names like example/ngc6543_3cols.dat_mean_cHb, example/ngc6543_3cols.dat_Oii_abund_CEL, etc. You can then generate plots of the probability distributions for the results by typing:

% ../utilities/plot.sh ngc6543_3cols.dat

Running the code without the optional parameter will generate plots for all files with names ending in binned in the working directory.

Questions?

If the documentation above is in some way lacking, and you've come to the end without being able to get good results from your line list, please get in touch! Contact any of the developers and we will do our best to help, and to update the manual accordingly.