Reading Data From a Txt and Storing Numbers to an Array

11. Reading and Writing Data Files: ndarrays

By Bernd Klein. Last modified: 01 February 2022.

NumPy offers several ways to read arrays from files and to write them to files. This chapter discusses the following functions:

savetxt
loadtxt
tofile
fromfile
save
load
genfromtxt

Saving textfiles with savetxt

The first two functions we will cover are savetxt and loadtxt.

In the following simple example, we define an array x and save it as a textfile with savetxt:

    import numpy as np

    x = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], np.int32)

    np.savetxt("test.txt", x)

The file "test.txt" is a textfile and its content looks like this:

    ~/Dropbox/notebooks/numpy$ more test.txt
    1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00
    4.000000000000000000e+00 5.000000000000000000e+00 6.000000000000000000e+00
    7.000000000000000000e+00 8.000000000000000000e+00 9.000000000000000000e+00

Attention: The above output has been created on the Linux command prompt!

It's also possible to write the array in a special format, for instance with three decimal places, or as integers padded to a width of four characters (with leading zeros for "%04d", with leading blanks for "%4d"). For this purpose we assign a format string to the third parameter 'fmt'. We saw in our first example that the default delimiter is a blank. We can change this behaviour by assigning a string to the parameter "delimiter". In most cases this string will consist of a single character, but it can also be a sequence of characters, like a smiley " :-) ":

    np.savetxt("test2.txt", x, fmt="%2.3f", delimiter=",")
    np.savetxt("test3.txt", x, fmt="%04d", delimiter=" :-) ")

The newly created files look like this:

    ~/Dropbox/notebooks/numpy$ more test2.txt
    1.000,2.000,3.000
    4.000,5.000,6.000
    7.000,8.000,9.000
    ~/Dropbox/notebooks/numpy$ more test3.txt
    0001 :-) 0002 :-) 0003
    0004 :-) 0005 :-) 0006
    0007 :-) 0008 :-) 0009

The complete syntax of savetxt looks like this:

savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ')

Parameter	Meaning
X	array_like. Data to be saved to a text file.
fmt	str or sequence of strs, optional. A single format (%10.5f), a sequence of formats, or a multi-format string, e.g. 'Iteration %d -- %10.5f', in which case 'delimiter' is ignored. For complex 'X', the legal options for 'fmt' are: a) a single specifier, "fmt='%.4e'", resulting in numbers formatted like "' (%s+%sj)' % (fmt, fmt)" b) a full string specifying every real and imaginary part, e.g. "' %.4e %+.4ej %.4e %+.4ej %.4e %+.4ej'" for 3 columns c) a list of specifiers, one per column - in this case, the real and imaginary part must have separate specifiers, e.g. "['%.3e + %.3ej', '(%.15e%+.15ej)']" for 2 columns
delimiter	A string used for separating the columns.
newline	A string (e.g. "\n", "\r\n" or ",\n") which will terminate a line instead of the default line ending.
header	A string that will be written at the beginning of the file.
footer	A string that will be written at the end of the file.
comments	A string that will be prepended to the 'header' and 'footer' strings, to mark them as comments. The hash tag '#' is used as the default.
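The header, footer and comments parameters from the table can be tried out with a short sketch. The file name "demo_header.txt", the column labels and the temporary directory are made up for illustration and are not part of the original examples.

```python
import os
import tempfile

import numpy as np

x = np.array([[1, 2.5], [2, 3.75], [3, 5.0]])

# Hypothetical file name, placed in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo_header.txt")

# One format per column: integer index, value with two decimals
np.savetxt(path, x, fmt=["%d", "%.2f"], delimiter=";",
           header="index;value", footer="end of data", comments="# ")

with open(path) as fh:
    content = fh.read()
print(content)

# loadtxt skips lines starting with '#', so header and footer are ignored
y = np.loadtxt(path, delimiter=";")
print(y)
```

Because the header and footer are marked with the comment prefix, the file can be read back with loadtxt without any extra arguments besides the delimiter.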

Loading Textfiles with loadtxt

We will now read in the file "test.txt", which we wrote in the previous subchapter:

    y = np.loadtxt("test.txt")
    print(y)

OUTPUT:

    [[ 1.  2.  3.]
     [ 4.  5.  6.]
     [ 7.  8.  9.]]

    y = np.loadtxt("test2.txt", delimiter=",")
    print(y)

OUTPUT:

    [[ 1.  2.  3.]
     [ 4.  5.  6.]
     [ 7.  8.  9.]]

Nothing new happens if we read in the text file in which we used a smiley as the separator:

    y = np.loadtxt("test3.txt", delimiter=" :-) ")
    print(y)

OUTPUT:

    [[ 1.  2.  3.]
     [ 4.  5.  6.]
     [ 7.  8.  9.]]
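All three results above come back as floats, because float is the default dtype of loadtxt. The following sketch, which assumes an in-memory buffer instead of the files used above, shows how the dtype parameter can be used to get integers back when the text contains plain integers.

```python
import io

import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], np.int32)

# Write to an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
np.savetxt(buffer, x, fmt="%d")
buffer.seek(0)

as_float = np.loadtxt(buffer)                  # default dtype is float
buffer.seek(0)
as_int = np.loadtxt(buffer, dtype=np.int32)    # request integers explicitly

print(as_float.dtype, as_int.dtype)
```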

It's also possible to choose the columns by index:

    y = np.loadtxt("test3.txt", delimiter=" :-) ", usecols=(0, 2))
    print(y)

OUTPUT:

    [[ 1.  3.]
     [ 4.  6.]
     [ 7.  9.]]
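Column selection combines well with the unpack parameter of loadtxt, which returns one array per selected column. This is a minimal sketch on made-up sample text held in memory; the variable names are only illustrative.

```python
import io

import numpy as np

text = "1 10 100\n2 20 200\n3 30 300\n"

# usecols picks columns 0 and 2; unpack=True returns one array per column
first, third = np.loadtxt(io.StringIO(text), usecols=(0, 2), unpack=True)

print(first)
print(third)
```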

In our next example we will read the file "times_and_temperatures.txt", which we created in the chapter on generators of our Python tutorial. Every line contains a time in the format "hh:mm:ss" and a random temperature between 10.0 and 25.0 degrees. We have to convert the time string into a float. The time will be expressed in minutes, with the seconds as a decimal fraction of a minute. We first define a function which converts "hh:mm:ss" into minutes:

    def time2float_minutes(time):
        # loadtxt may hand over a byte string, so decode it first
        if isinstance(time, bytes):
            time = time.decode()
        t = time.split(":")
        # 0.05 / 3 equals 1 / 60, i.e. seconds are converted to minutes
        minutes = float(t[0]) * 60 + float(t[1]) + float(t[2]) * 0.05 / 3
        return minutes

    for t in ["06:00:10", "06:27:45", "12:59:59"]:
        print(time2float_minutes(t))

OUTPUT:

    360.1666666666667
    387.75
    779.9833333333333

You might have noticed that we check whether time is a byte string. The reason for this is the use of our function time2float_minutes in loadtxt in the following example. The keyword parameter converters takes a dictionary which maps a column index (the key) to a function that converts the string data of that column into a float. The string data may be passed in as a byte string (this depends on the NumPy version and the encoding in use). That is why we decode it into a unicode string in our function:

    y = np.loadtxt("times_and_temperatures.txt",
                   converters={0: time2float_minutes})
    print(y)

OUTPUT:

    [[  360.     20.1]
     [  361.5    16.1]
     [  363.     16.9]
     ...,
     [ 1375.5    22.5]
     [ 1377.     11.1]
     [ 1378.5    15.2]]

    # delimiter=";",  # i.e. use ";" as delimiter instead of whitespace
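Since "times_and_temperatures.txt" comes from another chapter, the converter mechanism can also be tried on a few made-up lines held in memory. This is only a sketch: the sample values are invented, and the converter is a slightly restructured variant of the function defined above.

```python
import io

import numpy as np


def time2float_minutes(time):
    # Accept both bytes and str, since this depends on the NumPy version
    if isinstance(time, bytes):
        time = time.decode()
    hours, minutes, seconds = (float(part) for part in time.split(":"))
    return hours * 60 + minutes + seconds / 60


# Made-up sample lines in the "hh:mm:ss temperature" layout
sample = io.StringIO("06:00:00 20.1\n06:01:30 16.1\n06:03:00 16.9\n")

# Column 0 is passed through the converter, column 1 is parsed as float
y = np.loadtxt(sample, converters={0: time2float_minutes})
print(y)
```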

tofile

tofile is an ndarray method that writes the content of an array to a file, either in binary format, which is the default, or in text format.

A.tofile(fid, sep="", format="%s")

The data of the ndarray A is always written in 'C' order, regardless of the order of A.

The data file written by this method can be reloaded with the function fromfile().

Parameter	Meaning
fid	can be either an open file object, or a string containing a filename.
sep	The string 'sep' defines the separator between array items for text output. If it is empty (''), a binary file is written, equivalent to file.write(a.tobytes()) (formerly a.tostring()).
format	Format string for text file output. Each entry in the array is formatted to text by first converting it to the closest Python type, and then using 'format' % item.

Remark:

Information on endianness and precision is lost. Therefore it may not be a good idea to use the function to archive data or to transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.
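The remark can be made concrete with a small sketch. It assumes a made-up file name in a temporary directory and shows that a binary file written by tofile holds only raw bytes, so the reader has to know the dtype and the shape.

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)

# Hypothetical file name in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "raw_demo.bin")
a.tofile(path)

# The file holds only the raw bytes: 6 items * 8 bytes
print(os.path.getsize(path))

# The reader has to supply the dtype and restore the shape itself
flat = np.fromfile(path, dtype=np.float64)
restored = flat.reshape(2, 3)

# A wrong dtype is not detected; the bytes are simply reinterpreted
wrong = np.fromfile(path, dtype=np.float32)
print(flat.shape, wrong.shape)
```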

    dt = np.dtype([('time', [('min', int), ('sec', int)]),
                   ('temp', float)])
    x = np.zeros((1,), dtype=dt)
    x['time']['min'] = 10
    x['temp'] = 98.25
    print(x)

    # the with block closes the file after writing
    with open("test6.txt", "bw") as fh:
        x.tofile(fh)

OUTPUT:

    [((10, 0), 98.25)]

fromfile

fromfile reads in data which has been written with the tofile function. It's possible to read binary data if the data type is known. It's also possible to parse simply formatted text files. The data from the file is turned into an array.

The general syntax looks like this:

numpy.fromfile(file, dtype=float, count=-1, sep='')

Parameter	Meaning
file	'file' can be either a file object or the name of the file to read.
dtype	defines the data type of the array, which will be constructed from the file data. For binary files, it is used to determine the size and byte-order of the items in the file.
count	defines the number of items which will be read. -1 means all items will be read.
sep	The string 'sep' defines the separator between the items, if the file is a text file. If it is empty (''), the file will be treated as a binary file. A space (" ") in a separator matches zero or more whitespace characters. A separator consisting solely of spaces has to match at least one whitespace.
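The role of sep in text mode can be sketched as follows. The file name is made up and the file lives in a temporary directory; note that the array comes back one-dimensional, because the shape is not stored.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

path = os.path.join(tempfile.mkdtemp(), "text_demo.txt")  # hypothetical name

# A non-empty sep switches tofile to text output
a.tofile(path, sep=",", format="%d")

with open(path) as fh:
    text = fh.read()
print(text)

# fromfile needs the same separator; the result is one-dimensional
b = np.fromfile(path, dtype=int, sep=",")
print(b)
```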

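    # test4.txt is written with dtype int32 in the next example, so reading
    # it with the structured dtype dt merely reinterprets the raw bytes
    fh = open("test4.txt", "rb")
    np.fromfile(fh, dtype=dt)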

OUTPUT:

    array([((4294967296, 12884901890), 1.0609978957e-313),
           ((30064771078, 38654705672), 2.33419537056e-313),
           ((55834574860, 64424509454), 3.60739284543e-313),
           ((81604378642, 90194313236), 4.8805903203e-313),
           ((107374182424, 115964117018), 6.1537877952e-313),
           ((133143986206, 141733920800), 7.42698527006e-313),
           ((158913789988, 167503724582), 8.70018274493e-313),
           ((184683593770, 193273528364), 9.9733802198e-313)],
          dtype=[('time', [('min', '<i8'), ('sec', '<i8')]), ('temp', '<f8')])

    import numpy as np
    import os

    # platform dependent: difference between Linux and Windows
    #data = np.arange(50, dtype=np.int)
    data = np.arange(50, dtype=np.int32)
    data.tofile("test4.txt")

    fh = open("test4.txt", "rb")
    # skip the first 32 items: 32 items * 4 bytes = 128 bytes
    fh.seek(128, os.SEEK_SET)

    x = np.fromfile(fh, dtype=np.int32)
    print(x)

OUTPUT:

    [32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49]
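The same skip can be expressed without an explicit seek. The sketch below assumes NumPy 1.17 or newer, where fromfile accepts an offset parameter given in bytes; the file name is made up and the count of 5 is chosen only to keep the output short.

```python
import os
import tempfile

import numpy as np

data = np.arange(50, dtype=np.int32)
path = os.path.join(tempfile.mkdtemp(), "offset_demo.bin")  # hypothetical name
data.tofile(path)

# Skip the first 32 items; itemsize is 4 bytes for int32
skip_bytes = 32 * data.itemsize

# offset is given in bytes, count in items
x = np.fromfile(path, dtype=np.int32, count=5, offset=skip_bytes)
print(x)
```

Computing the offset from itemsize avoids hard-coding the 128 and keeps the code correct if the dtype changes.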

Attention:

It can cause problems to use tofile and fromfile for data storage, because the binary files generated are not platform independent. There is no byte-order or data-type information saved by tofile. Data can be stored in the platform independent .npy format using save and load instead.

Best Practice to Load and Save Data

The recommended way to store and load data with NumPy in Python is to use save and load. We also use a temporary file in the following example:

    import numpy as np

    # x still holds the array read in the previous example
    print(x)

    from tempfile import TemporaryFile
    outfile = TemporaryFile()

    x = np.arange(10)
    np.save(outfile, x)

    outfile.seek(0)  # Only needed here to simulate closing & reopening file
    np.load(outfile)

OUTPUT:

    [32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49]
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
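In contrast to tofile, the .npy format written by save keeps the dtype and the shape. A minimal sketch with a named file on disk, assuming a made-up file name in a temporary directory:

```python
import os
import tempfile

import numpy as np

a = np.arange(12, dtype=np.int16).reshape(3, 4)

path = os.path.join(tempfile.mkdtemp(), "demo.npy")  # hypothetical name
np.save(path, a)

# load restores dtype and shape without any extra arguments
b = np.load(path)
print(b.dtype, b.shape)
```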

and yet another way: genfromtxt

There is yet another way to read tabular input from a file to create arrays. As the name implies, the input file is supposed to be a text file. The text file can be in the form of an archive file as well. genfromtxt can process the archive formats gzip and bzip2. The type of the archive is determined by the extension of the file, i.e. '.gz' for gzip and '.bz2' for bzip2.
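Reading from a gzip archive can be sketched like this. The file name is made up, and the sketch relies on savetxt compressing its output automatically when the file name ends in ".gz", which is used here only to produce a test file.

```python
import os
import tempfile

import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])

# savetxt compresses automatically when the name ends in ".gz"
path = os.path.join(tempfile.mkdtemp(), "demo.txt.gz")  # hypothetical name
np.savetxt(path, x, delimiter=",")

# genfromtxt picks the decompressor from the file extension
y = np.genfromtxt(path, delimiter=",")
print(y)
```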

genfromtxt is slower than loadtxt, but it is capable of coping with missing data. It processes the file data in two passes. First it converts the lines of the file into strings. Then it converts the strings into the requested data type. loadtxt, on the other hand, works in one go, which is the reason why it is faster.
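How genfromtxt copes with missing data can be seen in a small sketch on made-up comma-separated text. The filling_values parameter used in the second call is an existing genfromtxt option that the text above does not mention.

```python
import io

import numpy as np

text = "1,2,3\n4,,6\n7,8,\n"

# Empty fields become nan by default
a = np.genfromtxt(io.StringIO(text), delimiter=",")
print(a)

# filling_values replaces missing entries with a chosen value
b = np.genfromtxt(io.StringIO(text), delimiter=",", filling_values=0)
print(b)
```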

recfromcsv(fname, **kwargs)

This is not really another way to read in csv data. 'recfromcsv' is basically a shortcut for

np.genfromtxt(filename, delimiter=",", dtype=None)
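The genfromtxt call above can be tried on made-up csv text. In this sketch names=True is an additional assumption, added so that the field names are taken from the first line; with dtype=None, genfromtxt guesses a type for each column.

```python
import io

import numpy as np

text = "minute,temp\n360,20.1\n361,16.1\n363,16.9\n"

# dtype=None lets genfromtxt guess a type per column;
# names=True takes the field names from the first line
data = np.genfromtxt(io.StringIO(text), delimiter=",", dtype=None, names=True)

print(data.dtype.names)
print(data["temp"])
```

The result is a structured array, so the columns are addressed by name rather than by index.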

Source: https://python-course.eu/numerical-programming/reading-and-writing-data-files-ndarrays.php