Monday, January 17, 2011

Processing vs. Inversion

Gearing up for a new semester of teaching Geophysical Data Processing has me updating slides and thinking about the big picture. Part of that has been to include an opening lecture about world oil production and CO2, since both will affect the employment of geophysicists for decades to come. I have written elsewhere on CO2 (plus here) and may add an entry on oil production soon.

Anyway, another key concept is the distinction between data processing (P) and inversion (I). It is important enough that Jon Claerbout has written an excellent book with that phrase in the title... EARTH SOUNDINGS ANALYSIS: Processing versus Inversion (free download here).

The main point for me is a question of input and output. Observed data is input to both P and I, but the outputs are fundamentally different things. Processing spits out processed data that goes on down the line to the next process, ultimately yielding a migrated image of the subsurface. Inversion is designed to output not data, but an estimate of earth parameters. For seismic inversion, this means velocity and density as a function of (x,y,z). I am thinking here of prestack inversion, but a similar argument can be made for poststack inversion. The latter case, however, is a hybrid of P and I, since the starting point for this kind of inversion is a migrated image.

For processing, at least seismic data processing, there is only a mild feedback loop, and it is implemented manually. By this I mean that each process is run and the output inspected by an expert user (hopefully) who judges the output data based on experience. Parameter updates also tend to be done manually, with the user tweaking the process parameters and re-running the job a few times at most. In well-worked geographic areas, many processes are virtually automatic since parameters are well understood from previous jobs. Along the road from raw seismic field data to final seismic image, there are likely to be dozens of individual processes. This decoupling allows data quality and improvement to be judged many times, and each process tends to be relatively quick and inexpensive and to involve the minimum number of user-chosen parameters. A decoupled processing flow works fine so long as the earth is well behaved. But in situations of extreme topography, velocity contrast, and structure, this simplified way of processing data breaks down and is unable to deliver a geologically reasonable subsurface image. In such cases, one grand processing scheme called prestack depth migration is applied. Cost-wise, it may rival inversion, but the output is still a seismic image, albeit an expensive one.
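To make the run, judge, tweak, re-run loop concrete, here is a toy Python sketch of a single process. Everything in it is an assumption chosen for illustration: the "process" is a plain running mean, the window length stands in for a user-chosen parameter, and the roughness threshold stands in for the expert's eye. No real processing package works this way; the point is only that data goes in and data comes out.

```python
def moving_average(trace, window):
    """Toy 'process' (hypothetical): smooth a trace with a centred running mean."""
    half = window // 2
    out = []
    for i in range(len(trace)):
        lo, hi = max(0, i - half), min(len(trace), i + half + 1)
        out.append(sum(trace[lo:hi]) / (hi - lo))
    return out


def roughness(trace):
    """Stand-in for the expert's judgment: mean absolute sample-to-sample jump."""
    return sum(abs(b - a) for a, b in zip(trace, trace[1:])) / (len(trace) - 1)


def run_process(trace, window=1, max_roughness=0.5, max_reruns=3):
    """Run the process, judge the output, tweak the parameter, re-run a few times at most."""
    for attempt in range(max_reruns + 1):
        processed = moving_average(trace, window)
        if roughness(processed) <= max_roughness:
            return processed, window, True    # "OK": pass data to the next process
        if attempt < max_reruns:
            window += 2                       # "Not OK": update the parameter and re-run
    return processed, window, False           # gave up; output is still data


observed = [0.0, 2.0, 0.0, 2.0, 0.0, 2.0, 0.0, 2.0]
processed, window, ok = run_process(observed)
print(f"accepted={ok}, window={window}, roughness={roughness(processed):.3f}")
```

Note that the result is another trace of the same length, ready for the next process in the chain. Nothing about the earth has been estimated.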

In the case of inversion, the feedback loop is built in: data is simulated from the current earth parameters, compared with the observed data, and the parameters are updated until the fit is acceptable. This raises three key questions. 1) How much physics do we include in the simulation? This is where physics and finance collide, since more physics means longer compute time and higher cost. 2) How do we compare simulated and observed data... simple difference, least squares, correlation, something more exotic? 3) How can we update the earth parameters based on the misfit? This brings in the whole field of optimization theory, conjugate gradients, genetic algorithms, and so on.
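The three questions map onto three functions in a toy Python sketch. All of the choices here are assumptions made for illustration, not a real seismic inversion: the "physics" is a straight-ray travel time (offset times slowness), the comparison is least squares, and the update is plain steepest descent with a fixed step rather than conjugate gradients or a genetic algorithm. The earth model is a single number, the slowness (1/velocity).

```python
def simulate(slowness, offsets):
    """Question 1, the physics: straight-ray travel time t = offset * slowness."""
    return [x * slowness for x in offsets]


def misfit(sim, obs):
    """Question 2, the comparison: least-squares misfit of simulated vs. observed."""
    return 0.5 * sum((s - o) ** 2 for s, o in zip(sim, obs))


def update(slowness, offsets, sim, obs, step):
    """Question 3, the update: one steepest-descent step on the misfit."""
    gradient = sum(x * (s - o) for x, s, o in zip(offsets, sim, obs))
    return slowness - step * gradient


def invert(obs, offsets, initial_slowness, tol=1e-12, max_iter=100):
    """Simulate, compare, update; repeat until the fit is OK or we run out of tries."""
    slowness = initial_slowness
    step = 0.5 / sum(x * x for x in offsets)   # deliberately cautious fixed step
    n_updates = 0
    for _ in range(max_iter):
        sim = simulate(slowness, offsets)
        if misfit(sim, obs) < tol:
            break                              # "Fit OK"
        slowness = update(slowness, offsets, sim, obs, step)   # "Fit Not OK"
        n_updates += 1
    return slowness, n_updates


offsets = [100.0, 200.0, 300.0, 400.0]         # metres (made-up geometry)
true_slowness = 1.0 / 2000.0                   # s/m, i.e. 2000 m/s
observed = simulate(true_slowness, offsets)    # synthetic "observed" data
estimate, n_updates = invert(observed, offsets, initial_slowness=1.0 / 1500.0)
print(f"estimated velocity: {1.0 / estimate:.1f} m/s after {n_updates} updates")
```

Here the output is not data but an earth parameter. Adding more physics to `simulate` is exactly where the compute cost would grow.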

To help get the basic ideas across, I used GraphViz (blog entry, web site) to make a couple of flow charts. In each, the input is shown in a blue box and the output is red. The code for each is also given below.

Processing Flow Diagram

Inversion Flow Diagram

*************************************
GraphViz code for processing flow chart:

digraph processing {
  // run the process, judge the output, tweak parameters if needed
  "Observed Data" -> Process
  "Proc Params" -> Process
  Process -> "Judge Quality"
  "Judge Quality" -> "OK"
  "Judge Quality" -> "Not OK"
  "Not OK" -> "Update Params"
  "Update Params" -> Process
  "OK" -> "Processed Data"
  "Processed Data" -> "Next Process"
  // input in blue, output in red
  "Observed Data" [shape=box, color=blue]
  "Processed Data" [shape=box, color=red]
}

*************************************
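GraphViz code for inversion flow chart:

digraph inversion {
  // simulate data from the current earth parameters
  "Initial Earth Params" -> Simulation
  Simulation -> "Sim Data"
  // compare simulated and observed data, update if the fit is poor
  "Observed Data" -> Compare
  "Sim Data" -> Compare
  Compare -> "Fit OK"
  Compare -> "Fit Not OK"
  "Fit Not OK" -> "Update Params"
  "Update Params" -> Simulation
  "Fit OK" -> "Earth Model"
  // input in blue, output in red
  "Observed Data" [shape=box, color=blue]
  "Earth Model" [shape=box, color=red]
}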