Files indexed by production date
Every day the application creates a file named file_YYYYMMDD.csv, where YYYYMMDD is the creation date. But sometimes the generation fails and no files are generated for several days.
I would like an easy way in a bash or sh script to find the filename of the most recent file that was released before a given key date.
Typical usage: Find the last file generated, not counting those released after May 1st.
Thanks for the help.
This script avoids:
- Repeatedly calling sed in a loop
- Parsing ls
- Creating a subshell in a while loop
- Processing files that do not match the file_*.csv name pattern
Here's the script:
#!/bin/bash
# usage: pass the key date as YYYYMMDD in $1
while read -r file
do
    date=${file#*_}    # strip off everything up to and including the underscore
    date=${date%.*}    # strip off the dot and everything after
    if [[ $date < $1 ]]
    then
        break
    fi
done < <(find . -name "file_*.csv" | sort -r)    # explicit "." keeps non-GNU find happy
# do something with $file, such as:
echo "$file"
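The same idea can be wrapped in a function for reuse. This is only a minimal sketch, not the answer's own code: the name latest_before and its directory argument are assumptions made for illustration, and it swaps find | sort for a glob, relying on the shell expanding globs in sorted order.

```shell
#!/bin/bash
# latest_before DIR KEYDATE
# Hypothetical helper: prints the newest DIR/file_YYYYMMDD.csv whose date is
# strictly before KEYDATE, and returns non-zero if there is none.
latest_before() {
    local dir=$1 key=$2 file date result=
    # Glob results come back sorted, so the last file that passes the
    # comparison is the most recent one before the key date.
    for file in "$dir"/file_*.csv; do
        [[ -e $file ]] || continue               # the glob matched nothing
        date=${file##*/file_}                    # drop the directory and the "file_" prefix
        date=${date%.csv}                        # drop the extension
        [[ $date =~ ^[0-9]{8}$ ]] || continue    # skip names without an 8-digit date
        [[ $date < $key ]] && result=$file       # string comparison is safe for fixed-width dates
    done
    [[ -n $result ]] && printf '%s\n' "$result"
}
```

Calling latest_before . 20240501 (an illustrative key date) would print the most recent file dated before that day, if any exists in the current directory.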
Edit:
With Bash >= 3.2, you can do this using a regex:
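#!/bin/bash
# usage: pass the key date as YYYYMMDD in $1
regex='file_([[:digit:]]+)\.csv'    # escape the dot so it matches a literal "."
while read -r file
do
    [[ $file =~ $regex ]] || continue    # skip names that do not carry a date
    date=${BASH_REMATCH[1]}              # the digits captured by the parentheses
    if [[ $date < $1 ]]
    then
        break
    fi
done < <(find . -name "file_*.csv" | sort -r)
# do something with $file, such as:
echo "$file"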
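To see what the regex match leaves in BASH_REMATCH, here is a small sketch that isolates just the extraction step. The function name extract_date is hypothetical, and the pattern is a slightly stricter variant (escaped dot, anchored at the end) than the one in the script.

```shell
#!/bin/bash
# extract_date NAME
# Hypothetical helper: prints the digits captured from a file_<digits>.csv
# name, or returns 1 when NAME does not match.
extract_date() {
    # Keep the pattern in a variable so the backslash is taken literally.
    local regex='file_([[:digit:]]+)\.csv$'
    [[ $1 =~ $regex ]] || return 1
    # BASH_REMATCH[0] holds the whole match, BASH_REMATCH[1] the first group.
    printf '%s\n' "${BASH_REMATCH[1]}"
}
```

For a path such as ./file_20240430.csv (an illustrative name) this prints 20240430; for a name without the pattern it prints nothing and fails.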
Try the following:
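#!/bin/bash
ls -r | while read -r fn; do
    # extract the date; sed -n ... p prints nothing when the name does not match
    date=$(printf '%s\n' "$fn" | sed -n -e 's/^file_\([0-9][0-9]*\)\.csv$/\1/p')
    [ -n "$date" ] || continue    # skip files that do not match the pattern
    if [ "$date" -lt "$1" ]; then
        echo "$fn"
        exit
    fi
done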
Just call this script with the key date you want to compare against, in YYYYMMDD form. Replace -lt with -le if you want to include the key date.
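The difference between the two operators only shows up when a file's date equals the key date. A minimal sketch, using a hypothetical helper named before_key whose optional third argument switches to the inclusive comparison:

```shell
#!/bin/bash
# before_key DATE KEY [inclusive]
# Hypothetical helper: succeeds when DATE is before KEY. With any non-empty
# third argument, a DATE equal to KEY is accepted as well.
before_key() {
    if [ -n "$3" ]; then
        [ "$1" -le "$2" ]    # inclusive: the key date itself counts
    else
        [ "$1" -lt "$2" ]    # exclusive: strictly before the key date
    fi
}
```

With an illustrative key of 20240501, before_key 20240501 20240501 fails, while before_key 20240501 20240501 inclusive succeeds.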
Edit: An alternative solution, without echoing the variable. Please note that I haven't tested it, but it should work too.
#!/bin/bash
# sed -n ... p drops every name that does not match the pattern
ls -r | sed -n -e 's/^file_\([0-9][0-9]*\)\.csv$/\1/p' | while read -r date; do
    if [ "$date" -lt "$1" ]; then
        echo "file_${date}.csv"
        exit
    fi
done
Sorting filenames with sort(1) will fail if there is a newline in a filename.
Instead, we should use something like:
touch $'filename\nwith\777pesky\177chars.txt' # create a test file
ls -1db *
find ... -print0 | LC_ALL=C sort0 ...
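One way to fill in such a pipeline is sketched below. It assumes a find that supports -maxdepth and -print0 and a sort that supports -z (both are available in the GNU tools, among others), and the function name latest_before_nul is hypothetical. It uses sort -z in place of the sort0 shown above, which is not a standard command.

```shell
#!/bin/bash
# latest_before_nul DIR KEYDATE
# Hypothetical helper: like the scripts above, but passes NUL-delimited names
# through the pipeline so that newlines in file names cannot split a record.
latest_before_nul() {
    local dir=$1 key=$2 file date
    while IFS= read -r -d '' file; do
        date=${file##*/}       # drop the directory part
        date=${date#file_}     # drop the prefix
        date=${date%.csv}      # drop the extension
        [[ $date =~ ^[0-9]+$ ]] || continue    # ignore names with anything but digits
        if [[ $date < $key ]]; then
            printf '%s\n' "$file"
            return 0
        fi
    done < <(find "$dir" -maxdepth 1 -name 'file_*.csv' -print0 | LC_ALL=C sort -rz)
    return 1    # no file is older than the key date
}
```

The loop reads newest first, so it can stop at the first date below the key, just like the line-based versions.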