Processing a data file of taxi passenger pick-ups and drop-offs
The original file looks like this:
name,time,jd,wd,status,v,angle,
粤B13K97,2011/04/18 00:00:26,114.044151,22.531418,1,75,5,
粤B13K97,2011/04/18 00:00:56,114.038452,22.530817,0,78,5,
粤B13K97,2011/04/18 00:01:56,114.026199,22.531134,0,84,6,
粤B13K97,2011/04/18 00:02:26,114.020035,22.532049,0,80,5,
粤B13K97,2011/04/18 00:02:56,114.013885,22.530767,1,82,5,
粤B13K97,2011/04/18 00:03:04,114.012283,22.530399,1,80,5,
粤B13K97,2011/04/18 00:03:26,114.007767,22.529682,1,79,5,
粤B13K97,2011/04/18 00:03:56,114.001984,22.530184,1,75,6,
粤B13K97,2011/04/18 00:04:26,113.996498,22.528917,0,75,5,
粤B13K97,2011/04/18 00:04:56,113.991653,22.526068,0,73,5,
粤B13K97,2011/04/18 00:05:26,113.986450,22.523933,0,74,5,
粤B13K97,2011/04/18 00:06:26,113.975067,22.522949,1,78,5,
粤B13K97,2011/04/18 00:06:34,113.973465,22.522949,1,80,5,
粤B13K97,2011/04/18 00:06:56,113.968849,22.522932,1,83,5,
粤B13K97,2011/04/18 00:07:56,113.956467,22.523268,1,69,6,
粤B13K97,2011/04/18 00:08:26,113.951736,22.523832,1,54,6,
粤B13K97,2011/04/18 00:08:56,113.949387,22.525534,1,49,0,
粤B13K97,2011/04/18 00:09:26,113.949799,22.529217,0,47,0,
......
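The fifth field holds the taxi's occupancy status: 1 means a passenger is on board, 0 means the taxi is empty. We need to pick out every pick-up and drop-off record, i.e. the rows on either side of a change from 1 to 0 or from 0 to 1:
Here is the shell script: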
sed -e 1d FileName | awk -F, '{ now_f5=int($5); if((now_f5 != pro_f5) && (pro != "")) print pro "\n" $0; pro_f5=int($5); pro=$0 }' | uniq
The processed data is:
粤B13K97,2011/04/18 00:00:26,114.044151,22.531418,1,75,5,
粤B13K97,2011/04/18 00:00:56,114.038452,22.530817,0,78,5,
粤B13K97,2011/04/18 00:02:26,114.020035,22.532049,0,80,5,
粤B13K97,2011/04/18 00:02:56,114.013885,22.530767,1,82,5,
粤B13K97,2011/04/18 00:03:56,114.001984,22.530184,1,75,6,
粤B13K97,2011/04/18 00:04:26,113.996498,22.528917,0,75,5,
粤B13K97,2011/04/18 00:05:26,113.986450,22.523933,0,74,5,
粤B13K97,2011/04/18 00:06:26,113.975067,22.522949,1,78,5,
粤B13K97,2011/04/18 00:08:56,113.949387,22.525534,1,49,0,
粤B13K97,2011/04/18 00:09:26,113.949799,22.529217,0,47,0,
......
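As a sketch of the same idea, the filter can also be written as a single awk program that skips the header itself and avoids printing a shared row twice, so neither sed nor uniq is needed. The function name pick_transitions and the variable names are hypothetical and used only for illustration. It assumes a single input file (or standard input) with one header row.

```shell
#!/bin/sh
# Sketch: print the row before and the row after every change in field 5 (status).
# pick_transitions is a hypothetical helper name, not part of the original script.
pick_transitions() {
    awk -F, '
        NR == 1 { next }                 # skip the header row
        {
            cur = int($5)                # occupancy status of the current row
            if (prev_line != "" && cur != prev_status) {
                # the previous row may already have been printed as the
                # second row of the preceding change; do not repeat it
                if (prev_line != last_printed) print prev_line
                print $0
                last_printed = $0
            }
            prev_status = cur
            prev_line = $0
        }
    ' "$@"
}
```

It can be called as `pick_transitions FileName`, or at the end of a pipeline with no argument to read standard input.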
A prologue for secure Bash scripts (for reference only)
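#! /bin/bash --
# '--' marks the end of options and disables any further option processing.
# Any arguments after '--' are treated as filenames and arguments.
# The argument '-' is equivalent. This guards against certain spoofing attacks.

# The IFS variable holds the input field separators; it affects how the shell
# interprets input data from this point on.
# To keep the shell from importing an outside setting of this variable,
# reset IFS to its standard value (space, tab, newline) at the start of the script:
IFS=$' \t\n'

# To make sure the commands we run are the ones we expect:
# first make sure unalias has not been redefined as a function (in POSIX, unset is a
# special built-in, which is looked up before functions and regular built-ins, so you
# need not worry about it being redefined as a function;
# GNU Bash, however, does allow it to be defined as a function, puzzlingly):
unset -f unalias

# Remove all command aliases; the leading '\' ensures that unalias itself is not an alias:
\unalias -a

# Make sure command is not a function; it is itself a regular built-in:
unset -f command

# Set up a PATH we can trust:
# 'command -p' uses the default value of $PATH and bypasses shell functions
# when running the command that follows.
# 'getconf' prints system configuration variable values.
SYSPATH="$(command -p getconf PATH 2>/dev/null)"
if [[ -z "$SYSPATH" ]]; then
    SYSPATH="/usr/bin:/bin"
fi
PATH="$SYSPATH:$PATH"

# Make sure all child processes inherit our secure search path:
export PATH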