Visual Processing and Image Analysis: A Deep Dive

Human Visual System (HVS)

Optic axis shifted ~5 degrees towards temple | Cornea is main refractive surface | Iris determines size of pupil | Pupil serves as an aperture | Retina receives wavelengths between 380-950 nm | We can see 380-770 nm, more red than blue, and 70-85% of white light reaches retina | Lens allows for focus by changing curvature | In young eye, cornea absorbs most of radiation below 300 nm, and lens filters out wavelengths below 380 nm | Rods are cylindrical and for low light, cones are conical and for high light | No rods in fovea, and cones are smaller in fovea, but densely packed | Cones come in 3 types, for red (580 nm), green (540 nm), and blue (450 nm) | Perception of color based on how cone sensors are excited | Rods are more sensitive across spectrum | 20 times as many rods as cones | Projection uses (X,Y,Z) = (A,B,C), (x,y) = (a,b), f = focal length, and concept of similar triangles | a/f = A/C and b/f = B/C, or (a,b) = f/C * (A,B) = (fA/C, fB/C) | In general, (x,y) = f/Z * (X,Y)
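
The projection rule (x,y) = f/Z * (X,Y) can be sketched in a few lines of Python. This is only a minimal illustration: the function name project and the sample point are made up for the example, and the check for Z = 0 is an added assumption, since the notes do not say what happens for a point in the plane of the lens.

```python
def project(point, f):
    """Perspective-project a 3-D point (X, Y, Z) to image coordinates (x, y).

    Uses the similar-triangles relation from the notes: (x, y) = f/Z * (X, Y).
    """
    X, Y, Z = point
    if Z == 0:
        # Assumption for this sketch: a point at Z = 0 has no finite projection.
        raise ValueError("Z must be nonzero")
    return (f * X / Z, f * Y / Z)


# Hypothetical example: point (A, B, C) = (2, 4, 10) and focal length f = 5.
a, b = project((2.0, 4.0, 10.0), 5.0)
print(a, b)  # a/f equals A/C and b/f equals B/C
```

Doubling Z halves both image coordinates, which is why distant objects appear smaller.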


Binary Image Processing (BIP)

1 = black, 0 = white | I = [I(i,j)], I(i,j) = 0 initially | Thresholding: J(i,j) = 0 if I(i,j) >= T, J(i,j) = 1 if I(i,j) < T | Binary majority (MAJ) only applicable for odd # of variables | Whichever value is held by the majority of the variables is chosen | Any binary operation can be done using NOT, AND, and OR | J_1 = NOT(I_1) reverses contrast, binary negative | J_2 = AND(I_1,I_2) = I_1 inter I_2 | J_3 = OR(I_1,I_2) = I_1 U I_2 | XOR(I_1,I_2) = OR(AND[I_2,NOT(I_1)],AND[NOT(I_2),I_1]) | Blob coloring assigns region # to pixels, then sees which regions are connected | Can remove smaller blobs and holes in a blob: set all blobs that don't have the same region # as the biggest blob to 0, NOT the image, repeat the removal on the complemented image, then NOT again | Expand = dilate, Shrink = erode | Window is a geometric relationship between pixels | Window always covers an odd number of pixels | B*I(i,j) = {I(i-m,j-n); (m,n) in B} is the set of image pixels covered by the window when it is centered at (i,j) | J(i,j) = G{B*I(i,j)} = G{I(i-m,j-n); (m,n) in B} | When window edge is outside of image bounds, fill with nearest image pixel, called replication | J_1 = Dilate(I,B) if J_1(i,j) = OR{I(i-m,j-n); (m,n) in B} | J_2 = Erode(I,B) if J_2(i,j) = AND{I(i-m,j-n); (m,n) in B} | J_3 = Median(I,B) if J_3(i,j) = MAJ{I(i-m,j-n); (m,n) in B} | Dilate increases size of black objects | Erode decreases size of black objects | Dilate removes holes of too small size, and gaps of too narrow width | Dilation of black = erosion of white | Erosion removes objects of too small size, and peninsulas of too narrow width | Erosion of black = dilation of white | Erosion and dilation are not complete inverses of each other | Median is similar to both in some ways | Median removes objects and holes of too small size, and gaps and peninsulas of too narrow width | Median usually doesn't change size of objects | Median is its own dual: MEDIAN[NOT(I)] = NOT[MEDIAN(I)] | Median is a shape smoother | Open(I,B) = Dilate[Erode(I,B),B] | Close(I,B) = Erode[Dilate(I,B),B] | Open/Close similar to median | Open removes too small objects/fingers but not holes, gaps, or bays | Close removes too small holes/gaps, but not objects or peninsulas | Open/Close generally don't affect size | Open for too small black objects | Close for too small white objects | Open-Close(I,B) = Open[Close(I,B),B] | Close-Open(I,B) = Close[Open(I,B),B] | Both Open-Close and Close-Open remove too small objects without affecting size too much | Both are similar to median except they smooth more | Open-Close usually links neighboring objects together | Close-Open usually links neighboring holes together | Skeletonization: I_n = ERODE[…ERODE[I_0,B],…B] (n consecutive erosions) | N = max{n: I_n != empty set} (largest number of erosions before I_n disappears) | S_n = I_n inter NOT[Open(I_n,B)] | SKEL(I_0,B) = S_1 U S_2 U … U S_N
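
The thresholding rule and the window operations (dilate, erode, median, open, close) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: images are lists of lists of 0/1 with 1 = black, the helper names (window_values, apply_filter, open_, close_, negate) and the cross-shaped window CROSS are chosen only for this example, and edges use the replication rule from the notes.

```python
def threshold(gray, T):
    """J(i,j) = 1 (black) if I(i,j) < T, else 0 (white)."""
    return [[1 if p < T else 0 for p in row] for row in gray]


def window_values(img, i, j, window):
    """Pixels I(i-m, j-n), (m,n) in B, for the window centered at (i, j)."""
    rows, cols = len(img), len(img[0])
    vals = []
    for m, n in window:
        r = min(max(i - m, 0), rows - 1)  # clamp index = replicate edge pixel
        c = min(max(j - n, 0), cols - 1)
        vals.append(img[r][c])
    return vals


def apply_filter(img, window, op):
    """J(i,j) = G{B*I(i,j)}, with G given by the function op."""
    return [[op(window_values(img, i, j, window))
             for j in range(len(img[0]))]
            for i in range(len(img))]


def dilate(img, window):
    return apply_filter(img, window, lambda v: int(any(v)))   # OR


def erode(img, window):
    return apply_filter(img, window, lambda v: int(all(v)))   # AND


def median(img, window):
    # MAJ: 1 if more than half of the covered pixels are 1 (odd-sized window).
    return apply_filter(img, window, lambda v: int(2 * sum(v) > len(v)))


def open_(img, window):
    return dilate(erode(img, window), window)


def close_(img, window):
    return erode(dilate(img, window), window)


def negate(img):
    """Binary NOT: reverses contrast."""
    return [[1 - p for p in row] for row in img]


# Assumed example window B: a 5-pixel cross given as (m, n) offsets.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

# A single black pixel on a white 5x5 image.
dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 1

for row in dilate(dot, CROSS):   # the pixel grows into the cross shape
    print(row)
for row in open_(dot, CROSS):    # the too-small object is removed
    print(row)
```

With this window, open_ removes the isolated black pixel and close_ fills an isolated white pixel, and the duality relations listed above (dilation of black = erosion of white, median is its own dual) can be checked directly by comparing, for example, erode(negate(I), B) with negate(dilate(I, B)).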


Image Processing Operations (PO)

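Histogram contains no spatial information | Average optical density (AOD) can be computed from histogram using AOD(I) = (1/(N*M)) * SUM_{k=0..K-1} k*H(k), where H(k) = # occurrences of gray level k, N*M = total # of pixels, and the kth term = (brightness level k) x (# occurrences of k) | Low AOD = underexposed, High AOD = overexposed | Point operation: J(i,j) = f[I(i,j)], applied independently at every pixel (i,j) | Additive offset: J(i,j) = I(i,j) + L, where L > 0 means a brighter result and L < 0 a darker result | Scaling, P > 0: J(i,j) = P*I(i,j), so same constant P multiplies every pixel | In practice J(i,j) = INT[P*I(i,j) + .5], where INT[R] = largest integer that is <= R | J will have a broader gray level range if P > 1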
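
The histogram-based average optical density and the scaling operation J(i,j) = INT[P*I(i,j) + .5] can be sketched as below. This is a minimal sketch under assumptions not stated in the notes: gray levels are integers in 0..255, the function names are illustrative, and scaled values are clipped to the valid range.

```python
import math


def histogram(img, levels=256):
    """H(k) = number of pixels with gray level k (no spatial information)."""
    h = [0] * levels
    for row in img:
        for p in row:
            h[p] += 1
    return h


def average_optical_density(img, levels=256):
    """Sum of (level k) x (# occurrences of k), divided by the pixel count."""
    h = histogram(img, levels)
    n_pixels = sum(h)
    return sum(k * h[k] for k in range(levels)) / n_pixels


def scale(img, P, levels=256):
    """J(i,j) = INT[P*I(i,j) + .5], with INT taken as floor.

    Clipping to 0..levels-1 is an assumption added for this sketch.
    """
    return [[max(0, min(levels - 1, math.floor(P * p + 0.5))) for p in row]
            for row in img]


# Hypothetical 2x2 image.
img = [[10, 20], [30, 40]]
print(average_optical_density(img))
print(scale(img, 1.5))
```

Because the same constant P multiplies every pixel, the spread between the darkest and brightest pixels grows when P > 1, up to the point where clipping sets in.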