Data, Big Data, and Surgery
Department of Surgery
Hospital Clinico San Carlos
21 June 2023
The industrial revolution, which brought about significant changes in manufacturing, transportation, and communication, is slowly fading into the past. In its place, digitization is advancing with unprecedented determination, generating massive amounts of data in the form of 0s and 1s. Technology has made it possible to accumulate endless strings of bytes that help us represent, explain, and even predict reality with varying degrees of accuracy.
Surgery is one sector that will be greatly affected by this unstoppable process. The hope is that digital technology and data analysis will make procedures more precise and accurate, leading to better patient outcomes. But there are few success stories in our field because the digital transformation of surgery is not just about technology; it is also a new culture. Surgeons need to learn how to collect and use data responsibly and ethically to solve complex problems and address unmet needs.
Surgical Data and Big Data
Surgical data can come from many sources along the patient care pathway (1): medical records (clinical notes, imaging, tests, etc), preoperative planning, intraoperative monitoring, video-laparoscopic procedures (image, pressure, energy use), emerging imaging-based applications (fluorescence with indocyanine green, augmented reality, etc), robotic surgery, postoperative monitoring and imaging, patient-reported data (wearables), and even clinical trials and social media. Data about system performance (surgical teams, hospital administrative processes, etc) can also be added.
Big data (2) refers to large sets of data that are generated from various sources at high speed and with great diversity. These data cannot be handled by traditional data-processing software and require special techniques for analysis and integration. The three Vs of big data are volume, velocity, and variety (3). Volume is the amount of data, which can be on the order of petabytes for some organizations. Velocity is the rate at which data are produced and processed, with some smart devices operating in real time or near real time. Variety refers to the different types of data: structured, unstructured, and semi-structured.
Structured data have a predefined format and meaning, such as patient demographics, vital signs, or lab tests. They can be easily stored, retrieved, and analyzed by electronic health record (EHR) systems and other applications. Unstructured data, such as clinical notes, reports, images, and videos, are not organized in a predefined format. They can provide valuable information for surgical planning, decision-making, and outcome prediction, but they also pose challenges for data analysis and integration. Unstructured text, for example, requires text mining and natural language processing (NLP) techniques to extract meaningful features and transform them into structured or semi-structured data that can be used by machine learning and deep learning models.
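As an illustration of turning unstructured text into structured data, the following minimal sketch applies simple rules to a free-text note. The note, the field names, and the `extract_features` function are hypothetical and invented for this example; real clinical NLP pipelines are considerably more elaborate than a few regular expressions.

```python
import re

# Hypothetical free-text operative note (invented for illustration only).
NOTE = (
    "Laparoscopic cholecystectomy. Operative time 75 min. "
    "Estimated blood loss 50 ml. No intraoperative complications."
)

def extract_features(note):
    """Turn a free-text note into a small structured record using simple rules."""
    time_match = re.search(r"operative time\s+(\d+)\s*min", note, re.IGNORECASE)
    loss_match = re.search(r"blood loss\s+(\d+)\s*ml", note, re.IGNORECASE)
    # Matches a sentence fragment such as "No intraoperative complications".
    denial = re.search(r"\bno\b[^.]*\bcomplications\b", note, re.IGNORECASE)
    return {
        "operative_time_min": int(time_match.group(1)) if time_match else None,
        "blood_loss_ml": int(loss_match.group(1)) if loss_match else None,
        "complications_denied": denial is not None,
    }

record = extract_features(NOTE)
print(record)
```

The resulting dictionary has a predefined set of fields, so it could be stored and queried like any other structured record. Fields that cannot be found are left as `None` rather than guessed.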
Two further Vs were later added to the description (4): veracity and value. Veracity means the quality or reliability of the data, which is crucial if big data analytics are to support accurate and meaningful decisions, that is, to deliver value. Not all data are equally trustworthy, and some data may be incomplete, inconsistent, inaccurate, or misleading. Therefore, veracity is an important aspect of big data that needs to be assessed and addressed before the data are used.
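A veracity assessment can start with very basic checks. The sketch below counts missing values, implausible values, and duplicate rows in a handful of records. The records, the plausibility ranges, and the `veracity_report` function are all assumptions made for this example, not a validated quality framework.

```python
# Hypothetical records (invented for illustration); None marks a missing value.
records = [
    {"patient_id": "A1", "age": 64, "heart_rate": 72},
    {"patient_id": "A2", "age": None, "heart_rate": 80},   # incomplete
    {"patient_id": "A3", "age": 58, "heart_rate": 400},    # implausible
    {"patient_id": "A1", "age": 64, "heart_rate": 72},     # duplicate
]

# Assumed plausibility ranges, chosen only for this example.
PLAUSIBLE = {"age": (0, 120), "heart_rate": (20, 250)}

def veracity_report(rows):
    """Count missing values, out-of-range values, and exact duplicate rows."""
    missing = sum(1 for r in rows for k in PLAUSIBLE if r[k] is None)
    out_of_range = sum(
        1
        for r in rows
        for k, (low, high) in PLAUSIBLE.items()
        if r[k] is not None and not low <= r[k] <= high
    )
    seen = set()
    duplicates = 0
    for r in rows:
        key = tuple(sorted(r.items()))  # hashable fingerprint of the row
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {
        "rows": len(rows),
        "missing_values": missing,
        "out_of_range_values": out_of_range,
        "duplicate_rows": duplicates,
    }

report = veracity_report(records)
print(report)
```

A report like this does not repair the data; it only makes quality problems visible so that they can be addressed before analysis.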
Appropriately acquired, trustworthy data would be valuable for addressing surgical problems and making more accurate and precise decisions if they were used to discover hidden patterns, make informed assumptions, and predict behaviour and outcomes. Yet some additional steps must be taken before this is possible. The first is storage. Big data are often stored in a data lake, an IT infrastructure that can support various data types (Hadoop clusters, cloud object storage services, NoSQL databases, etc). It is important to understand the difference between data warehouses and data lakes. While data warehouses are commonly built on relational databases and contain structured data only, data lakes can store schemaless data (often referred to as unstructured data) on a distributed file system. This file system splits large datasets into blocks and distributes them across the cluster nodes. Once data are stored, many tools are available to manage big data, such as Apache Hadoop, Apache Spark, and MongoDB. Some of the common solutions for analysis are presented in Table I.
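The idea of splitting data into blocks and spreading them over cluster nodes can be sketched in a few lines. This is a toy model under stated assumptions: the block size, the node names, the round-robin placement, and the functions `split_into_blocks` and `place_blocks` are invented for illustration and do not reproduce the placement policy of any real distributed file system.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(n_blocks, nodes, replicas):
    """Assign each block to `replicas` different nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("cannot place more replicas than there are nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replicas)]
        for block in range(n_blocks)
    }

data = b"x" * 1000                     # stand-in for a large file
blocks = split_into_blocks(data, 256)  # toy block size, far smaller than real ones
nodes = ["node-a", "node-b", "node-c"] # hypothetical cluster nodes
placement = place_blocks(len(blocks), nodes, replicas=2)

for block_id, holders in placement.items():
    print(block_id, len(blocks[block_id]), holders)
```

Keeping each block on more than one node means that the file can still be reassembled if a single node fails, which is one reason such systems replicate blocks.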
Benefits and Challenges
Big data analytics could have a significant impact on various areas of surgery (5), such as:
1. Improvement of training modalities: by designing more effective and personalized training programmes for surgeons, based on their performance data, feedback, and learning preferences. It could also enable the use of simulation and virtual reality to enhance the skills and knowledge of surgical trainees.
2. Cognitive enhancement of surgical team members: by supporting the decision-making and problem-solving abilities of surgeons and other team members during surgery through real-time information, guidance, and alerts. It could also help monitor and reduce the cognitive workload and stress levels of the surgical staff, and improve their communication and collaboration.
3. Procedural automation: by enabling the development and implementation of more advanced and reliable surgical robots that can perform complex tasks autonomously or semi-autonomously, with minimal human intervention. It could also help optimize the workflow and efficiency of surgical procedures by automating some of the routine or repetitive steps.
4. Surgical research: by analyzing large-scale data from various sources, such as electronic health records, clinical trials, imaging, sensors, and social media. It could also help identify new trends, patterns, correlations, and causal relationships in surgical outcomes and complications, and support evidence-based practice and innovation.
5. Process management: by increasing the quality and safety of surgical care through monitoring and evaluating the performance and outcomes of surgical processes and providing feedback and recommendations for improvement. It could also help reduce the costs and risks of surgery by optimizing the use of resources, minimizing errors and adverse events, and enhancing patient satisfaction and engagement.
However, big data analytics also raises ethical, legal, and social issues that need to be addressed, such as patient privacy and confidentiality, data quality and validity, accountability and responsibility, and the impact and implications of data use (6). By engaging in ethical reflection and dialogue with peers, patients, and stakeholders, surgeons can ensure that their use of big data is responsible, respectful, and beneficial.
In summary, despite the potential benefits of big data analytics, surgeons must be ready to face the challenge of integrating this technology into their practice and building trust. Trust can be established by following quality standards, using data governance frameworks, and applying ethical principles. Surgeons must then prepare to adopt smarter training modalities, supervise the learning of machines that can enhance cognitive function, and ultimately oversee autonomous surgery without allowing manual surgical skills to decay. Playing a role in the development of this new technology demands partnering with data scientists to capture data across phases of care and provide clinical context. This collaboration can help optimize the use of big data in surgery to improve patient outcomes.