
Data Handling in Python - Json

by bents 2021. 2. 5.
  1. Import the json package.
  2. Read the data with load() or loads().
  3. Process the data.
  4. Write the altered data with dump() or dumps().
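The four steps above can be sketched end to end roughly as follows. The file name todo.json and the record contents are placeholders chosen for illustration, not part of the original post.

```python
import json
import os
import tempfile

# Hypothetical sample data and file location, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "todo.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump({"title": "write post", "completed": False}, f)

# Step 2: read the data with load().
with open(path, encoding="utf-8") as f:
    record = json.load(f)

# Step 3: process the data.
record["completed"] = True

# Step 4: write the altered data back with dump().
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f, indent=4)

# Read it once more to confirm the change was persisted.
with open(path, encoding="utf-8") as f:
    result = json.load(f)
```

load() and dump() work on file objects; loads() and dumps() are the string counterparts.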

# Terminology

serialization : the process of encoding data as JSON

The term refers to transforming data into a series of bytes (hence serial) so that it can be stored or transmitted across a network.

deserialization : the reciprocal process of decoding data that has been stored or delivered in the JSON standard.
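As a minimal sketch of both directions, assuming UTF-8 as the byte encoding and a made-up sample dict:

```python
import json

data = {"user": "alice", "tags": ["python", "json"]}  # hypothetical sample data

# Serialization: Python object -> JSON text -> bytes ready to store or send.
payload = json.dumps(data).encode("utf-8")

# Deserialization: bytes -> JSON text -> Python object.
restored = json.loads(payload.decode("utf-8"))
```

Note that json.dumps() itself returns a str; the explicit encode() step is what produces the series of bytes.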

Marshaling and serialization are loosely synonymous in the context of remote procedure call, but semantically different as a matter of intent.

In particular, marshaling is about getting parameters from here to there, while serialization is about copying structured data to or from a primitive form such as a byte stream. In this sense, serialization is one means to perform marshaling, usually implementing pass-by-value semantics.

It is also possible for an object to be marshaled by reference, in which case the data "on the wire" is simply location information for the original object. However, such an object may still be amenable to value serialization.
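The pass-by-value point can be illustrated with a JSON round trip. This is only a sketch with made-up data: the receiver rebuilds the structure from its serialized form, so later changes on one side do not affect the other.

```python
import json

original = {"scores": [1, 2, 3]}  # hypothetical sample data

# A serialize/deserialize round trip rebuilds the structure from its
# serialized form, so the receiver holds a copy, not a reference.
received = json.loads(json.dumps(original))
received["scores"].append(4)
```

After this runs, original still holds three scores while received holds four.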

1) Serialize

: json.dump() writes serialized JSON data to a file object; json.dumps() returns it as a native Python str object.

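import json

data = {"key": "value"}  # placeholder: any JSON-serializable object

with open("data_file.json", "w") as write_file:
    json.dump(data, write_file)

# json.dumps(data) returns the same JSON as a str instead of writing to a file.
# json.dumps(data, indent=4) pretty-prints it with 4-space indentation.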
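# Round trip: a tuple is encoded as a JSON array and decoded as a list.
blackjack_hand = (8, "Q")
encoded_hand = json.dumps(blackjack_hand)
decoded_hand = json.loads(encoded_hand)

blackjack_hand == decoded_hand  # False: tuple compared with list
blackjack_hand == tuple(decoded_hand)  # True: tuple compared with tuple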
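The tuple is not the only type that changes on a round trip. A small sketch with a made-up dict shows two more conversions the json module performs: non-string keys become strings, and None becomes null.

```python
import json

# Hypothetical sample: an integer key, a tuple value, and None.
original = {1: ("a", "b"), "flag": None}

text = json.dumps(original)   # '{"1": ["a", "b"], "flag": null}'
restored = json.loads(text)   # {"1": ["a", "b"], "flag": None}
```

So restored is not equal to original here, even though no data was lost in the sense JSON cares about.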

2) Deserialize

with open("data_file.json", "r") as read_file:
    data = json.load(read_file)
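load() and loads() differ only in where the JSON comes from. In the sketch below, io.StringIO stands in for an open file and the JSON text is made up for illustration.

```python
import io
import json

text = '{"id": 1, "done": true}'  # hypothetical JSON document

from_string = json.loads(text)            # loads(): parse a str
from_file = json.load(io.StringIO(text))  # load(): parse a file-like object
```

Both calls produce the same Python dict.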

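# Example - count, then extract the maximum

# Loading omitted: response is assumed to be an HTTP response whose body is a JSON array of TODO items.
todos = json.loads(response.text)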

# Map of userId to number of complete TODOs for that user
todos_by_user = {}

# Increment complete TODOs count for each user.
for todo in todos:
    if todo["completed"]:
        try:
            # Increment the existing user's count.
            todos_by_user[todo["userId"]] += 1
        except KeyError:
            # This user has not been seen. Set their count to 1.
            todos_by_user[todo["userId"]] = 1

# Create a sorted list of (userId, num_complete) pairs. 
# Sort by value; reverse=True gives descending order.
top_users = sorted(todos_by_user.items(), 
                   key=lambda x: x[1], reverse=True)

# Get the maximum number of complete TODOs.
max_complete = top_users[0][1]

# Create a list of all users who have completed
# the maximum number of TODOs.
# Keep only the top entries (several users may tie for the maximum).
users = []
for user, num_complete in top_users:
    if num_complete < max_complete:
        break
    users.append(str(user))

max_users = " and ".join(users)
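The same count-then-take-the-maximum logic can also be written with collections.Counter. This is an alternative sketch, not the approach from the source tutorial, and sample_text is a made-up stand-in for response.text.

```python
import json
from collections import Counter

# Hypothetical stand-in for response.text, shaped like the TODO records above.
sample_text = json.dumps([
    {"userId": 1, "completed": True},
    {"userId": 1, "completed": True},
    {"userId": 2, "completed": True},
    {"userId": 2, "completed": True},
    {"userId": 3, "completed": False},
])
todos = json.loads(sample_text)

# Count completed TODOs per user.
counts = Counter(todo["userId"] for todo in todos if todo["completed"])

# Find the maximum and every user who reaches it (ties included).
max_complete = max(counts.values())
users = [str(user) for user, n in counts.items() if n == max_complete]
max_users = " and ".join(users)
```

Counter removes the try/except KeyError bookkeeping. Note that max() raises ValueError if no TODO is completed, so real input may need a guard.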

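## Serializing data structures that are not serializable by default

- In Python, a j suffix in a numeric expression denotes a complex number. The json module cannot serialize complex values by default, so what can we do?

1) Write a default function for the encoder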

def encode_complex(z):
    if isinstance(z, complex):
        return (z.real, z.imag)
    else:
        type_name = z.__class__.__name__
        print(z.__class__)
        raise TypeError(f"Object of type '{type_name}' is not JSON serializable")

# Pass the function as default; json calls it only for objects it cannot encode itself.
json.dumps(9 + 5j, default=encode_complex)
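A short sketch of what the default hook does and does not cover. The function is repeated (without the debug print) so the block runs on its own; the set is just an arbitrary example of a type the function does not handle.

```python
import json

def encode_complex(z):
    # Called only for objects json cannot serialize on its own.
    if isinstance(z, complex):
        return (z.real, z.imag)
    raise TypeError(f"Object of type '{type(z).__name__}' is not JSON serializable")

# The returned tuple is then serialized as a JSON array.
encoded = json.dumps(9 + 5j, default=encode_complex)  # '[9.0, 5.0]'

# Types the function does not handle still fail with TypeError.
try:
    json.dumps({1, 2}, default=encode_complex)
    failed = False
except TypeError:
    failed = True
```

The complex number comes out as a plain two-element array, so nothing in the JSON marks it as complex.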

2) Write an encoder class that overrides the default method

class ComplexEncoder(json.JSONEncoder):
    def default(self, z):
        if isinstance(z, complex):
            return (z.real, z.imag)
        else:
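            return super().default(z)


json.dumps(2 + 5j, cls=ComplexEncoder)  # '[2.0, 5.0]'

# encode() already returns a JSON string, so call it directly;
# wrapping it in json.dumps() would encode that string a second time.
ComplexEncoder().encode(3 + 6j)  # '[3.0, 6.0]'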

- What about decoding?

def decode_complex(dct):
    if "__complex__" in dct:
        return complex(dct["real"], dct["imag"])
    return dct
    
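# decode_complex only converts JSON objects that carry a "__complex__" key
# together with "real" and "imag", so complex_data.json must use that shape.
with open("complex_data.json") as complex_data:
    data = complex_data.read()
    numbers = json.loads(data, object_hook=decode_complex)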

numbers
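The encoders shown earlier emit a plain [real, imag] array, which decode_complex cannot recognize. A round trip therefore needs an encoder that writes the tagged object the decoder looks for. The sketch below assumes that shape; TaggedComplexEncoder is a hypothetical name, not something from the source tutorial.

```python
import json

class TaggedComplexEncoder(json.JSONEncoder):  # hypothetical name
    def default(self, z):
        if isinstance(z, complex):
            # Emit the tagged object shape that decode_complex looks for.
            return {"__complex__": True, "real": z.real, "imag": z.imag}
        return super().default(z)

def decode_complex(dct):
    # object_hook is called for every decoded JSON object (dict).
    if "__complex__" in dct:
        return complex(dct["real"], dct["imag"])
    return dct

text = json.dumps([42 + 36j, 1.5], cls=TaggedComplexEncoder)
numbers = json.loads(text, object_hook=decode_complex)
```

Here numbers comes back as a list holding the original complex value and the untouched float.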

 

 

source : "Working With JSON Data in Python" (Real Python), realpython.com/python-json/
