US CADx models
Ultrasound imaging is an adjunct to diagnostic mammography, where CADx models could be used to improve diagnostic accuracy. CADx models developed for US scans date back to the late 1990s. In this section, we review studies that apply CADx systems to breast sonography, or to the US-mammography combination, to distinguish malignant from benign lesions. A summary of the primary US CADx models is presented in TABLE 2.
Giger et al. classified lesions as benign or malignant in a database of 184 digitized US images [47]. Benign lesions were confirmed by biopsy, cyst aspiration or image interpretation alone, whereas malignancy was proven at biopsy. The authors used an LDA model to differentiate between benign and malignant lesions using five computer-extracted features based on lesion shape and margin, texture, and posterior acoustic attenuation (two features). ROC analysis yielded AUCs of 0.94 for the entire database and 0.87 for the subset that included only biopsy- and cyst-proven cases. The authors concluded that computerized analysis could improve the specificity of breast sonography.
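The pipeline described above — a handful of computer-extracted features scored by an LDA classifier and summarized by ROC analysis — can be sketched as follows. This is a minimal illustration using scikit-learn and synthetic feature values, not the study's actual data or implementation:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 184                                    # database size, as in the study
y = rng.integers(0, 2, size=n)             # 1 = malignant, 0 = benign
# Five computer-extracted features (synthetic stand-ins for shape, margin,
# texture, and two posterior-acoustic-attenuation measures)
X = rng.normal(size=(n, 5)) + y[:, None]

lda = LinearDiscriminantAnalysis().fit(X, y)
scores = lda.decision_function(X)          # malignancy score per lesion
auc = roc_auc_score(y, scores)             # ROC analysis
print(f"AUC = {auc:.2f}")
```

The AUC here reflects only the synthetic class separation, not the 0.94 reported in the study.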
Chen et al. developed an ANN to classify malignancies on US images [48]. A physician manually selected sub-images corresponding to a suspicious tumor region, followed by computerized analysis of intensity variation and texture information. The texture correlation between neighboring pixels was used as the input to the ANN. The training and testing dataset included 140 biopsy-proven breast tumors (52 malignant). Performance was assessed by AUC, sensitivity and specificity, yielding an AUC of 0.956 with 98% sensitivity and 93% specificity at a threshold level of 0.2. The authors concluded that their CADx model was useful in distinguishing benign from malignant cases, but noted that larger datasets could improve performance.
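The core input used by Chen et al. — the correlation between neighboring pixels as a texture descriptor — can be illustrated with a short sketch. The formulation below is our own simplification (one normalized lag-1 correlation per image), not the study's exact feature:

```python
import numpy as np

def neighbor_autocorrelation(img, dx=1, dy=0):
    """Normalized correlation between each pixel and its (dy, dx) neighbor —
    a simple texture feature of the kind fed to the ANN."""
    a = img[: img.shape[0] - dy, : img.shape[1] - dx].ravel()
    b = img[dy:, dx:].ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(1)
smooth = rng.normal(size=(64, 64)).cumsum(axis=1)  # horizontally correlated
noise = rng.normal(size=(64, 64))                  # uncorrelated speckle
c_smooth = neighbor_autocorrelation(smooth)
c_noise = neighbor_autocorrelation(noise)
print(f"smooth texture: {c_smooth:.2f}, speckle: {c_noise:.2f}")
```

Smooth, spatially coherent texture yields a high neighbor correlation, while uncorrelated speckle yields a value near zero — the kind of contrast an ANN can exploit to separate lesion types.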
Later, Chen et al. improved on their previous study [48] and devised an ANN model composed of three components: feature extraction, feature selection, and classification of benign and malignant lesions [49]. The study used two sets of biopsy-proven lesions: the first with 160 digitally stored lesions (69 malignant) and the second with 111 lesions (71 malignant) on hard-copy images obtained with the same US system. The hard-copy images were digitized using film scanners. Seven morphologic features were extracted from each lesion by an image-processing algorithm. Given the classifier, forward stepwise regression was employed to select the best-performing features, which were then used as inputs to a two-layer feed-forward ANN. For the first set, the ANN achieved an AUC of 0.952 with 90.6% sensitivity and 86.6% specificity; for the second set, an AUC of 0.982 with 96.7% sensitivity and 97.2% specificity. The ANN model trained on each dataset was demonstrated to be statistically extendible to other datasets at a 5% significance level. The authors concluded that their ANN model was an effective and robust approach for lesion classification, performing better than the counterparts published earlier [47,48].
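The forward stepwise selection step — greedily adding the feature that most improves the classifier — can be sketched with scikit-learn's `SequentialFeatureSelector`. The data are synthetic (only the first three of seven "morphologic" features carry signal), and the wrapped classifier is an LDA rather than the study's ANN, purely for brevity:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(2)
n = 160                                  # size of the first lesion set
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 7))              # seven morphologic features
X[:, :3] += y[:, None] * 1.5             # only the first three are informative

# Forward stepwise selection wrapped around the classifier
sel = SequentialFeatureSelector(
    LinearDiscriminantAnalysis(), n_features_to_select=3,
    direction="forward", scoring="roc_auc", cv=5).fit(X, y)
selected = sorted(np.flatnonzero(sel.get_support()).tolist())
print("selected feature indices:", selected)
```

With this synthetic signal, the selector should recover mostly the informative features; in the study the surviving features were then fed to the two-layer feed-forward ANN.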
Horsch et al. explored three aspects of an LDA classifier based on automatic segmentation of lesions and automatic extraction of lesion shape, margin, texture and posterior acoustic behavior [50]. The study used a database of 400 cases comprising 94 malignancies, 124 complex cysts and 182 benign lesions; the reference standard was either biopsy or aspiration. First, the marginal benefit of adding a feature to the LDA model was investigated. Second, the performance of the LDA model in distinguishing carcinomas from different types of benign lesion was explored; the AUC values were 0.93 for distinguishing carcinomas from complex cysts and 0.72 for differentiating fibrocystic disease from carcinoma. Finally, eleven independent trials of training and testing were conducted to validate the LDA model, yielding a mean AUC of 0.87 when computer-extracted features from automatically delineated lesion margins were used. There was no statistically significant difference between the best two- and four-feature classifiers; therefore, adding features to the LDA model did not improve its performance.
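The validation protocol of repeated independent train/test trials can be sketched as follows. The eleven random splits and the four-feature LDA mirror the study's design; the data and the 50/50 split fraction are assumptions for illustration:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 400                                    # database size, as in the study
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 4)) + y[:, None]   # four computer-extracted features

aucs = []
for trial in range(11):                    # eleven independent trials
    Xtr, Xte, ytr, yte = train_test_split(
        X, y, test_size=0.5, random_state=trial, stratify=y)
    model = LinearDiscriminantAnalysis().fit(Xtr, ytr)
    aucs.append(roc_auc_score(yte, model.decision_function(Xte)))
print(f"mean AUC over 11 trials = {np.mean(aucs):.2f}")
```

Averaging over independent splits, rather than reporting a single resubstitution AUC, gives a less optimistic estimate of generalization performance.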
Sahiner et al. investigated computer vision techniques to characterize breast tumors on 3D US volumetric images [51]. The dataset comprised masses from 102 women who underwent either biopsy or fine-needle aspiration (56 had malignant masses). Automated mass segmentation in 2D and 3D, followed by feature extraction and LDA, was implemented to obtain malignancy scores. Stepwise feature selection was employed to reduce eight morphologic and 72 texture features to a best-feature subset. An AUC of 0.87 was achieved by the 2D-based classifier and 0.92 by the 3D-based classifier; the difference was not statistically significant (p = 0.07). The AUC values of the four radiologists ranged from 0.84 to 0.92, and the difference between the model and the radiologists was not statistically significant (p = 0.05). However, the partial AUC of the model was significantly higher than those of three of the radiologists (p < 0.03, 0.02 and 0.001).
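The partial AUC used for that last comparison restricts the ROC analysis to a clinically relevant operating region. A brief sketch with scikit-learn, using synthetic malignancy scores and an assumed false-positive-rate cap of 0.1 (the study's exact region is not stated here):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)
y = rng.integers(0, 2, size=500)           # 1 = malignant
scores = rng.normal(size=500) + 1.2 * y    # synthetic malignancy scores

full_auc = roc_auc_score(y, scores)
# Partial AUC restricted to the high-specificity region (FPR <= 0.1),
# standardized (McClish) so 0.5 still corresponds to chance performance
partial_auc = roc_auc_score(y, scores, max_fpr=0.1)
print(f"full AUC = {full_auc:.2f}, partial AUC (FPR <= 0.1) = {partial_auc:.2f}")
```

Two classifiers with similar full AUCs can differ substantially in partial AUC, which is why the model could beat three radiologists on the partial metric despite overlapping full-AUC ranges.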
Drukker et al. used various feature segmentation and extraction schemes as inputs to a Bayesian neural network (BNN) classifier with five hidden layers [52]. The purpose of the study was to evaluate a CADx workstation in a realistic setting representative of clinical diagnostic breast US practice. Benign and malignant lesions verified at biopsy or aspiration, as well as those determined through imaging characteristics on US scans, MR images and mammograms, were used for the analysis; the authors included non-biopsied lesions to make the series consecutive, which more accurately reflects clinical practice. The inputs to the network were lesion descriptors consisting of the depth:width ratio, radial gradient index, posterior acoustic signature and an autocorrelation texture feature, and the output represented the probability of malignancy. The study was conducted on a patient population of 508 (101 had breast cancer) with 1046 distinct abnormalities (157 cancerous lesions). In a comparison with current radiology practice, the CADx scheme achieved an AUC of 0.90, corresponding to 100% sensitivity at 30% specificity, while radiologists performed at 77% specificity for 100% sensitivity when only non-biopsied lesions were included. When only biopsy-proven lesions were analyzed, computerized lesion characterization outperformed the radiologists.
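Operating points such as "specificity at 100% sensitivity" — the metric used to compare the workstation with radiologists — can be read directly off an ROC curve. A minimal sketch with synthetic network outputs (not the study's data):

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(5)
y = rng.integers(0, 2, size=1046)          # 1 = cancerous lesion
scores = rng.normal(size=1046) + 2.0 * y   # network's malignancy output

fpr, tpr, thresholds = roc_curve(y, scores)
# First operating point at which every cancer is caught (sensitivity = 100%)
idx = int(np.argmax(tpr >= 1.0))
spec_at_full_sens = 1.0 - fpr[idx]
print(f"specificity at 100% sensitivity: {spec_at_full_sens:.0%}")
```

This is a stringent criterion: a single low-scoring cancer drags the usable threshold down and can sharply reduce the achievable specificity.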
In routine clinical practice, radiologists often combine the results from mammography and US, if available, when making diagnostic decisions. Several studies demonstrated that CADx could be useful in differentiating benign findings from malignant breast masses when sonographic data are combined with corresponding mammographic data. Horsch et al. evaluated and compared the performance of five radiologists with different levels of expertise and five imaging fellows, with and without the help of a BNN [53]. The BNN model utilized a computerized segmentation of the lesion. Mammographic input features included spiculation, lesion shape, margin sharpness, texture and gray level; sonographic input features included lesion shape, margin, texture and posterior acoustic behavior. All features were extracted automatically by an image-processing algorithm. This retrospective study examined a total of 359 (199 malignant) mammographic and 358 (67 malignant) sonographic images; an additional 97 (39 malignant) multimodality cases (both mammogram and sonogram) were used for testing purposes only. Biopsy was the reference standard. The performance of each radiologist, imaging fellow or pair of observers was quantified by AUC, sensitivity and specificity. The average AUC was 0.87 without the BNN and 0.92 with it (p < 0.001); the sensitivities without and with the BNN were 0.88 and 0.93, respectively (p = 0.005). There was no significant difference in specificity (0.66 vs 0.69, p = 0.20). The authors concluded that the performance of the radiologists and imaging fellows increased significantly with the help of the BNN model.
In another multimodality study, Sahiner et al. investigated the effect of a multimodal CADx system (using mammography and US data) on discriminating between benign and malignant lesions [54]. The dataset consisted of 13 mammography features (nine morphologic, three spiculation and one texture) and eight 3D US features (two morphologic and six texture) extracted from 67 biopsy-proven masses (35 malignant). Ten experienced readers first gave a malignancy score based on mammography alone, then re-evaluated based on mammography and US combined, and were finally allowed to revise their assessment given the CADx system's evaluation of the mass. The CADx system automatically extracted the features, which were then fed into a multimodality classifier (using LDA) to produce a risk score. The results were compared using ROC curves, which suggested a statistically significant improvement (p = 0.05) when the CADx system was consulted (average AUC = 0.95) over the readers' assessment of combined mammography and US without the CADx (average AUC = 0.93). Sahiner et al. concluded that a CADx system combining features from mammography and US may have the potential to improve radiologists' diagnostic decisions [54].
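Feature-level fusion of the two modalities — concatenating the 13 mammography features with the eight 3D US features and scoring the combined vector with an LDA — can be sketched as follows. The feature values are synthetic and the cross-validation scheme is an assumption added for a fair AUC estimate:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(6)
n = 67                                        # masses, as in the study
y = rng.integers(0, 2, size=n)                # 1 = malignant
mammo = rng.normal(size=(n, 13)) + 0.6 * y[:, None]  # mammography features
us = rng.normal(size=(n, 8)) + 0.6 * y[:, None]      # 3D US features

fused = np.hstack([mammo, us])                # feature-level fusion
scores = cross_val_predict(LinearDiscriminantAnalysis(), fused, y,
                           cv=5, method="decision_function")
auc = roc_auc_score(y, scores)
print(f"cross-validated AUC (fused features) = {auc:.2f}")
```

In the study the resulting risk score was shown to the readers rather than used as a standalone decision, which is what produced the reported AUC gain from 0.93 to 0.95.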
As discussed previously, a variety of sonographic features (texture, margin and shape) are used to classify benign and malignant lesions. 2D/3D Doppler imaging offers additional advantages for classification over grayscale imaging by demonstrating breast lesion vascularity. Chang et al. extracted features of tumor vascularity from 3D power Doppler US images of 221 lesions (110 benign) and devised an ANN to classify the lesions [55]. The study demonstrated that CADx using 3D power Doppler imaging can aid in the classification of benign and malignant lesions.
In addition to the aforementioned studies, other works have developed and evaluated CADx systems for differentiating between benign and malignant lesions. Joo et al. developed an ANN that was demonstrated to have the potential to increase the specificity of US characterization of breast lesions [56]. Song et al. compared an LR model and an ANN for differentiating between malignant and benign masses on breast sonograms from a small dataset [57]; there was no statistically significant difference between the performances of the two methods. Shen et al. investigated the statistical correlation between computerized sonographic features, as defined by BI-RADS, and signs of malignancy [58]. Chen and Hsiao evaluated US-based CADx systems by reviewing the methods used in classification [59], and suggested the inclusion of pathologically specific tissue- and hormone-related features in future CADx systems. Gruszauskas et al. examined the effect of image selection on the performance of a breast US CADx system and concluded that their automated breast sonography classification scheme was reliable even with variation in user input [60]. Recently, Cui et al. published a study focusing on the development of an automated method for segmenting and characterizing breast masses on US images [61]; their CADx system performed similarly whether it used automated segmentation or an experienced radiologist's segmentation. In a recent study, Yap et al. designed a survey to evaluate the benefits of computerized processing of US images in improving readers' performance in breast cancer detection and classification [62]. The study demonstrated marginal improvements in classification when computer-processed US images are used alongside the originals to distinguish benign from malignant lesions.